Insertions with a sequential GUID key on a clustered index are not significantly faster
In SQL Server 2008 I have tried to reproduce the results of the experiment on clustered indexes over sequential vs. non-sequential GUID keys described here:
http://sqlblog.com/blogs/denis_gobo/archive/2009/02/05/11743.aspx
However, I do not see the significant speedup for insertions that I would expect (and that the author reports). Page utilization is clearly improved with the sequential GUID, but for some reason inserting 10,000 rows is only around 100 ms faster (out of 10,300 ms).
I use the following code:
CREATE TABLE TestGuid1 (Id UNIQUEIDENTIFIER not null DEFAULT newid(),
SomeDate DATETIME, batchNumber BIGINT)
CREATE TABLE TestGuid2 (Id UNIQUEIDENTIFIER not null DEFAULT newsequentialid(),
SomeDate DATETIME, batchNumber BIGINT)
CREATE CLUSTERED INDEX ix_id1 ON TestGuid1(id)
CREATE CLUSTERED INDEX ix_id2 ON TestGuid2(id)
go
SET NOCOUNT ON
INSERT TestGuid1 (SomeDate,batchNumber) VALUES (GETDATE(),3)
go 10000
SET NOCOUNT ON
INSERT TestGuid2 (SomeDate,batchNumber) VALUES (GETDATE(),3)
go 10000
DBCC showcontig ('TestGuid1') WITH tableresults
DBCC showcontig ('TestGuid2') WITH tableresults
SELECT batchNumber,DATEDIFF(ms,MIN(SomeDate),MAX(SomeDate))
FROM TestGuid1
GROUP BY batchNumber
SELECT batchNumber,DATEDIFF(ms,MIN(SomeDate),MAX(SomeDate))
FROM TestGuid2
GROUP BY batchNumber
Can anyone explain why I do not experience a more significant speedup for insertions on TestGuid2?
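For intuition about the page-utilization gap, here is a toy Python simulation of b-tree leaf pages under the two insert patterns. It is a sketch, not SQL Server's actual allocation logic: it assumes a fixed 100-row page, a 50/50 page split for mid-range inserts, and an "allocate a fresh page" split when a strictly ascending key overflows the rightmost page.

```python
import bisect
import random

PAGE_CAPACITY = 100  # rows per leaf page; illustrative, not SQL Server's real row math

def insert_all(keys):
    """Insert keys one by one into sorted leaf pages, mimicking b-tree page splits."""
    pages = [[]]  # each page is a sorted list of keys; pages are kept in key order
    for k in keys:
        # locate the page whose key range should hold k (last page for a new maximum)
        i = len(pages) - 1
        for j, p in enumerate(pages):
            if p and k <= p[-1]:
                i = j
                break
        page = pages[i]
        bisect.insort(page, k)
        if len(page) > PAGE_CAPACITY:
            if i == len(pages) - 1 and k == page[-1]:
                # ascending insert overflowing the rightmost page: allocate a fresh
                # page and move only the new row, leaving the old page 100% full
                pages.append([page.pop()])
            else:
                # mid-range insert: 50/50 page split, both halves end up half full
                half = len(page) // 2
                pages.insert(i + 1, page[half:])
                del page[half:]
    return pages

def density(pages):
    return sum(len(p) for p in pages) / (len(pages) * PAGE_CAPACITY)

random.seed(42)
n = 20_000
seq_pages = insert_all(range(n))                        # NEWSEQUENTIALID()-like keys
rnd_pages = insert_all(random.sample(range(10**9), n))  # NEWID()-like keys
print(f"sequential: {density(seq_pages):.1%} over {len(seq_pages)} pages")  # → 100.0% over 200 pages
print(f"random:     {density(rnd_pages):.1%} over {len(rnd_pages)} pages")
```

The random pattern settles near the classic ~69% (ln 2) utilization of b-trees under random insertion, which is close to the 68.4% measured for TestGuid1 below; the ascending pattern keeps pages essentially full. Note this models page density only, not insert timing.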
Follow-up:
As requested in the thread below, I have expanded the test: the results tend to vary significantly between runs, so the experiments are now repeated N times, and the total and average times are reported. I have also added a third test, using a primary key on a sequential integer column. This should be the fastest and most compact of the three methods, since the integer type is smaller and IDENTITY(1,1) is (or at least should be) fast. At least by my intuition.
The average execution time now favors the sequential GUID, but surprisingly the insertions in the third experiment (with sequential integer keys) are slower than with sequential GUIDs. I have no explanation for this.
Here is the code for the new experiments:
SET NOCOUNT ON
CREATE TABLE TestGuid1 (Id UNIQUEIDENTIFIER NOT NULL DEFAULT NEWID() PRIMARY KEY,
SomeDate DATETIME, batchNumber BIGINT, FILLER CHAR(100))
CREATE TABLE TestGuid2 (Id UNIQUEIDENTIFIER NOT NULL DEFAULT NEWSEQUENTIALID() PRIMARY KEY,
SomeDate DATETIME, batchNumber BIGINT, FILLER CHAR(100))
CREATE TABLE TestInt (Id Int NOT NULL identity(1,1) PRIMARY KEY,
SomeDate DATETIME, batchNumber BIGINT, FILLER CHAR(100))
DECLARE @BatchCounter INT = 1
DECLARE @NumRows INT = 100000
WHILE (@BatchCounter <= 20)
BEGIN
BEGIN TRAN
DECLARE @LocalCounter INT = 0
WHILE (@LocalCounter <= @NumRows)
BEGIN
INSERT TestGuid1 (SomeDate,batchNumber) VALUES (GETDATE(),@BatchCounter)
SET @LocalCounter +=1
END
SET @LocalCounter = 0
WHILE (@LocalCounter <= @NumRows)
BEGIN
INSERT TestGuid2 (SomeDate,batchNumber) VALUES (GETDATE(),@BatchCounter)
SET @LocalCounter +=1
END
SET @LocalCounter = 0
WHILE (@LocalCounter <= @NumRows)
BEGIN
INSERT TestInt (SomeDate,batchNumber) VALUES (GETDATE(),@BatchCounter)
SET @LocalCounter +=1
END
SET @BatchCounter +=1
COMMIT
END
DBCC showcontig ('TestGuid1') WITH tableresults
DBCC showcontig ('TestGuid2') WITH tableresults
DBCC showcontig ('TestInt') WITH tableresults
SELECT batchNumber,DATEDIFF(ms,MIN(SomeDate),MAX(SomeDate)) AS [NEWID()]
FROM TestGuid1
GROUP BY batchNumber
SELECT batchNumber,DATEDIFF(ms,MIN(SomeDate),MAX(SomeDate)) AS [NEWSEQUENTIALID()]
FROM TestGuid2
GROUP BY batchNumber
SELECT batchNumber,DATEDIFF(ms,MIN(SomeDate),MAX(SomeDate)) AS [IDENTITY()]
FROM TestInt
GROUP BY batchNumber
DROP TABLE TestGuid1
DROP TABLE TestGuid2
DROP TABLE TestInt
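The script prints one elapsed time per batch; the totals and averages quoted below were aggregated outside SQL Server. A minimal sketch of that aggregation step (the timing values here are made-up placeholders, not the measured results):

```python
from statistics import mean

def summarize(batch_ms):
    """Collapse per-batch DATEDIFF results (milliseconds) into total and average."""
    return {"total_ms": sum(batch_ms), "avg_ms": round(mean(batch_ms), 1)}

# placeholder per-batch timings, NOT the real measurements
example = [1900, 2100, 1950, 2050]
print(summarize(example))  # → {'total_ms': 8000, 'avg_ms': 2000.0}
```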
And the average execution times (in ms):
NEWID() 3064
NEWSEQUENTIALID() 1977
IDENTITY() 2223
The page usage is as follows:
Table      Pages  AveragePageDensity (%)
----------------------------------------
TestGuid1  50871  68.4
TestGuid2  35089  99.2
TestInt    32259  98.7
I fail to see why these page statistics (which are best for TestInt) do not imply that the third experiment is the fastest.
Comments (2)
Can you try this modified script and post your results?
I see quite wildly varying results between individual runs (on my laptop, not a server!) but a definite trend for sequential to be faster.
NEWID()
Average 5168.9
NEWSEQUENTIALID()
Average 3000.85
Since I wrote that original blog post, I decided to run your code; here is what I get.
Remember, I am running this on a server with 32 GB of RAM and 8 procs, not on a laptop.
On my local machine, I see almost no difference between the two.
Remember, besides inserts, reads will be much slower because the table is fragmented.
Here is what I get when running Martin's script on the server.
Here is what happens on my desktop (the files are not pre-sized, BTW).