NewSequentialId on UniqueIdentifier Clustered Index

Posted on 2024-11-27 17:02:12

I am working on database standards for a new database my company is starting. One of the things we are trying to define is Primary Key and Clustered Index rules in relation to UniqueIdentifiers.

(NOTE: I do not want a discussion on the pros and cons of using a UniqueIdentifier as a primary key or clustered index. There is a ton of info on the web about that. This is not that discussion.)

So here is the scenario that has me worried:

Say I have a table with a UniqueIdentifier as the clustered index and primary key. Let's call it ColA. I set the default value for ColA to be NewSequentialId().

Using that NewSequentialId() I insert three sequential rows:

{72586AA4-D2C3-440D-A9FE-CC7988DDF065}
{72586AA4-D2C3-440D-A9FE-CC7988DDF066}
{72586AA4-D2C3-440D-A9FE-CC7988DDF067}

Then I reboot my server. The docs for NewSequentialId say that "After restarting Windows, the GUID can start again from a lower range, but is still globally unique."

So the next starting point can be lower than the previous range.

So after the restart, I insert 3 more values:

{35729A0C-F016-4645-ABA9-B098D2003E64}
{35729A0C-F016-4645-ABA9-B098D2003E65}
{35729A0C-F016-4645-ABA9-B098D2003E66}

(I am not sure exactly how the GUID is represented in the database, but let's assume that, since these start with 3 and the previous ones started with 7, the "3" values are smaller than the "7" values.)
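The assumption above can be checked with a small sketch. It assumes that SQL Server compares uniqueidentifier values by byte groups from right to left (the last six bytes first, the leading eight hex digits last), which is how the ordering is commonly described and how .NET's SqlGuid compares values. The byte-order table and the function name are illustrative and not part of the original question.

```python
import uuid

# Byte positions in comparison order, most significant first.
# Assumption: this mirrors the order used by .NET's SqlGuid, applied to the
# mixed-endian layout that Python exposes as UUID.bytes_le.
SQL_SERVER_BYTE_ORDER = (10, 11, 12, 13, 14, 15, 8, 9, 6, 7, 4, 5, 0, 1, 2, 3)


def sql_server_sort_key(text):
    """Return a tuple that sorts the way uniqueidentifier is assumed to sort."""
    raw = uuid.UUID(text).bytes_le
    return tuple(raw[i] for i in SQL_SERVER_BYTE_ORDER)


# Last value inserted before the reboot and first value inserted after it.
before_reboot = "{72586AA4-D2C3-440D-A9FE-CC7988DDF067}"
after_reboot = "{35729A0C-F016-4645-ABA9-B098D2003E64}"

after_sorts_lower = sql_server_sort_key(after_reboot) < sql_server_sort_key(before_reboot)
print(after_sorts_lower)
```

Under that assumed ordering, the post-reboot values do sort lower, but because their final group (B098D2003E64) is below CC7988DDF065, not because they start with 3 rather than 7.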

When you do an insert that is in the middle of a clustered index, a remapping of the index has to happen. (At least so my DBA has told me.) And every time I reboot I run the risk of having my new UniqueIdentifier range be right in the middle of other previous ranges.

So my question is: Since the next set of UniqueIdentifiers will be smaller than the last set, will every insert cause my clustered index to shuffle?

And if not, why? Does SQL Server know that I am using NewSequentialId? Does it somehow compensate for that?

If not, then how does it know what I will insert next? Maybe the next million inserts will start with 3. Or maybe they will start with 7. How does it know?

Or does it not know, and just keep everything in order? If that is the case, then one reboot could massively affect performance. (Which makes me think I need my own custom NewSequentialId that is not affected by reboots.) Is that correct? Or is there some magic I am not aware of?

EDIT: GUID as a clustered index is strongly discouraged in my standard. As I said above, there are many reasons that this is a bad idea. I am trying to find out if this is another reason why.

Comments (1)

沒落の蓅哖 2024-12-04 17:02:12

Normally you would create your indexes with an appropriate FILLFACTOR to leave free space in every page for just such a scenario. That said, once that free space is used up, inserts into the middle of the clustered index do cause pages to split and rows to be moved.

I know you don't want to discuss using GUID as a clustered key, but this is one of the reasons that it's not a recommended practice.

What will happen is that you will have an increasing volume of page splits, which will lead to a very high level of fragmentation as you keep inserting rows, and you will need to rebuild your index at a higher frequency to keep performance in line.
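The effect can be illustrated with a toy model. This is only a sketch under stated assumptions: the leaf level is modeled as a list of sorted pages with a made-up capacity of 100 rows, a full page splits 50/50 when a row lands in the middle, and an insert past the end of the index simply starts a new page. The class and function names are hypothetical, and this is not SQL Server's actual algorithm.

```python
import bisect

PAGE_CAPACITY = 100  # hypothetical rows per page, not a real SQL Server figure


class ToyClusteredIndex:
    """Leaf level of a toy index: a list of sorted pages with a fixed capacity."""

    def __init__(self):
        self.pages = [[]]
        self.mid_splits = 0  # splits that had to move existing rows

    def insert(self, key):
        # Target the first page whose last key is >= key, else the last page.
        page_no = len(self.pages) - 1
        for i, candidate in enumerate(self.pages):
            if candidate and key <= candidate[-1]:
                page_no = i
                break
        page = self.pages[page_no]
        if len(page) >= PAGE_CAPACITY:
            at_end = page_no == len(self.pages) - 1 and key > page[-1]
            if at_end:
                # Insert past the end of the index: new page, nothing moves.
                self.pages.append([key])
                return
            # Insert into the middle: move the upper half to a new page.
            half = len(page) // 2
            new_page = page[half:]
            del page[half:]
            self.pages.insert(page_no + 1, new_page)
            self.mid_splits += 1
            if key > page[-1]:
                page = new_page
        bisect.insort(page, key)


def simulate(second_range):
    """Insert 1000 ascending keys, then a second batch standing in for a reboot."""
    index = ToyClusteredIndex()
    for key in range(2000, 3000):  # rows inserted before the reboot
        index.insert(key)
    for key in second_range:       # rows inserted after the reboot
        index.insert(key)
    return index


higher = simulate(range(3000, 4000))  # new range sorts above the old one
lower = simulate(range(1000, 2000))   # new range sorts below the old one
print(higher.mid_splits, len(higher.pages))
print(lower.mid_splits, len(lower.pages))
```

In this model the higher range causes no mid-index splits and ends with 20 full pages, while the lower range causes 20 splits and ends with 30 partly filled pages. The splits come once per half page of new rows rather than on every insert, so the cost in the toy model is extra page movement and lower page density, which matches the fragmentation described above.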

For a full treatment of the topic, there's no better source than Kim Tripp's blog.

As a side note, if you are considering writing your own NewSequentialID function, you probably have a design issue and should reconsider your plan.
