(SQL Server) Why does a GUID or other random value as the high-order key cause fragmentation?
I'm watching Paul S. Randal's index fragmentation video. In it, Paul says that a GUID or other random value used as the high-order key causes page splits, which in turn cause index fragmentation. The reason given is that random key values mean new records get inserted into random pages, which will likely trigger page splits on those pages. My question: even if new records are not inserted into random pages, page splits can still occur, right? If that is true, what is the difference between the two cases?

By the way, does "high-order key" mean the primary key? I'm not a native English speaker and am not sure about this.

Thanks.
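For reference, the fragmentation Paul is describing can be measured directly; here is a minimal sketch, assuming a hypothetical dbo.Orders table, that queries the sys.dm_db_index_physical_stats DMF:

    -- Report logical fragmentation and page counts for every index on dbo.Orders.
    SELECT  i.name AS index_name,
            ips.index_type_desc,
            ips.avg_fragmentation_in_percent,
            ips.page_count
    FROM    sys.dm_db_index_physical_stats(DB_ID(), OBJECT_ID('dbo.Orders'),
                                           NULL, NULL, 'LIMITED') AS ips
    JOIN    sys.indexes AS i
            ON  i.object_id = ips.object_id
            AND i.index_id  = ips.index_id;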
2 Answers
The index data is clustered on the keys used. "High-order" is mentioned because indexes can be composite, and the first member is the most significant for the sort order.

So if that member is random (and a GUID has to be treated as random, even if crypto people don't like the term "random" for GUIDs), you will get far more frequent page splits than if you insert consecutive values, even when those values are not at the end of the range.

(That's also why using NEWSEQUENTIALID() instead of NEWID() helps insert performance a lot when a GUID is used as the clustered primary key.)
If your keys are sequential, then the data will all be appended at the end, meaning new pages are created only as needed. However, in a high-volume system this creates a "hotspot" where all the data is being pushed. The solution is to build the clustered index on some kind of categorisation field first, followed by the incrementing primary key.
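A minimal sketch of that layout, using hypothetical names (a region code stands in for the categorisation field; the surrogate key stays as a nonclustered primary key):

    -- Cluster on (category, incrementing key): inserts are spread across one
    -- insertion point per category instead of a single hotspot at the end.
    CREATE TABLE dbo.Events
    (
        RegionCode char(2)      NOT NULL,            -- the categorisation field
        EventId    bigint       IDENTITY(1,1) NOT NULL,
        Payload    varchar(100) NULL,
        CONSTRAINT PK_Events PRIMARY KEY NONCLUSTERED (EventId)
    );

    CREATE CLUSTERED INDEX CIX_Events_RegionCode_EventId
        ON dbo.Events (RegionCode, EventId);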