NewSequentialId on a UniqueIdentifier clustered index

Posted on 2024-11-27 17:02:12

I am working on database standards for a new database my company is starting. One of the things we are trying to define is Primary Key and Clustered Index rules in relation to UniqueIdentifiers.

(NOTE: I do not want a discussion on the pros and cons of using a UniqueIdentifier as a primary key or clustered index. There is a ton of info on the web about that. This is not that discussion.)

So here is the scenario that has me worried:

Say I have a table with a UniqueIdentifier column as both the clustered index and the primary key. Let's call it ColA. I set the default value for ColA to NewSequentialId().

Using that NewSequentialId() I insert three sequential rows:

{72586AA4-D2C3-440D-A9FE-CC7988DDF065}
{72586AA4-D2C3-440D-A9FE-CC7988DDF066}
{72586AA4-D2C3-440D-A9FE-CC7988DDF067}

Then I reboot my server. The docs for NewSequentialId say that "After restarting Windows, the GUID can start again from a lower range, but is still globally unique."

So the next starting point can be lower than the previous range.

So after the restart, I insert 3 more values:

{35729A0C-F016-4645-ABA9-B098D2003E64}
{35729A0C-F016-4645-ABA9-B098D2003E65}
{35729A0C-F016-4645-ABA9-B098D2003E66}

(I am not sure exactly how the GUID is represented in the database, but let's assume that, since these start with 3 and the previous ones started with 7, the 3 ones are "smaller" than the 7 ones.)
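Since the sort order matters to the whole question, here is a minimal Python sketch of how the comparison might go. It assumes SQL Server orders uniqueidentifier values with the same byte order that .NET's SqlGuid type uses (the last six bytes are the most significant). The helper name `sql_guid_sort_key` is my own, not anything from SQL Server.

```python
import uuid

# Byte comparison order used by .NET's SqlGuid type. ASSUMPTION: SQL Server
# orders uniqueidentifier values the same way (last six bytes first).
SQL_GUID_BYTE_ORDER = (10, 11, 12, 13, 14, 15, 8, 9, 6, 7, 4, 5, 0, 1, 2, 3)


def sql_guid_sort_key(text):
    """Return a tuple that sorts GUID strings in the assumed SQL Server order."""
    # bytes_le is the Windows/.NET byte layout (first three groups little-endian).
    raw = uuid.UUID(text).bytes_le
    return tuple(raw[i] for i in SQL_GUID_BYTE_ORDER)


before_reboot = "{72586AA4-D2C3-440D-A9FE-CC7988DDF065}"
after_reboot = "{35729A0C-F016-4645-ABA9-B098D2003E64}"

# True when the post-reboot value sorts below the pre-reboot value.
print(sql_guid_sort_key(after_reboot) < sql_guid_sort_key(before_reboot))
```

Under that assumption the post-reboot values do sort lower, but because of the last group (B098... versus CC79...), not because of the leading 3 versus 7.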

When you do an insert in the middle of a clustered index, the index has to be rearranged to make room. (At least so my DBA has told me.) And every time I reboot, I run the risk of my new UniqueIdentifier range landing right in the middle of previous ranges.
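To make the worry concrete, here is a toy Python model of the leaf level of an index: sorted pages with a made-up capacity that split in half when they overflow. It is not SQL Server's storage engine (`PAGE_CAPACITY`, `insert_key` and `count_splits` are hypothetical names, and a real engine has more page-split logic than this). It only shows what inserting an ascending range below the existing keys does to the pages in such a model.

```python
import bisect

PAGE_CAPACITY = 4  # hypothetical rows per page; real pages hold far more


def insert_key(pages, key):
    """Insert key into sorted, non-empty pages; return True if a page split."""
    first_keys = [page[0] for page in pages]
    # Target page: the last page whose first key is <= key (or page 0).
    idx = max(bisect.bisect_right(first_keys, key) - 1, 0)
    page = pages[idx]
    bisect.insort(page, key)
    if len(page) <= PAGE_CAPACITY:
        return False
    # Overflow: split the page in half, keeping the overall key order.
    mid = len(page) // 2
    pages[idx:idx + 1] = [page[:mid], page[mid:]]
    return True


def count_splits(pages, keys):
    """Insert every key in order and count how many inserts caused a split."""
    return sum(insert_key(pages, key) for key in keys)


pages = [[1000]]
# Ascending range standing in for the pre-reboot "7..." values.
splits_before = count_splits(pages, range(1001, 1040))
# Lower ascending range standing in for the post-reboot "3..." values.
splits_after = count_splits(pages, range(100, 140))

print(splits_before, splits_after)
```

In this toy, the lower range only splits a page when that page overflows, at roughly the same rate as the original ascending range. It does not reshuffle the whole index on every insert. Whether SQL Server behaves the same way is exactly what I am asking.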

So my question is: Since the next set of UniqueIdentifiers will be smaller than the last set, will every insert cause my clustered index to shuffle?

And if not, why? Does SQL Server know that I am using NewSequentialId? Does it somehow compensate for that?

If not, then how does it know what I will insert next? Maybe the next million inserts will start with 3. Or maybe they will start with 7. How does it know?

Or does it not know, and just keep everything in order? If that is the case, then one reboot could massively affect performance. (Which makes me think I need my own custom NewSequentialId that is not affected by reboots.) Is that correct? Or is there some magic I am not aware of?

EDIT: GUID as a clustered index is strongly discouraged in my standard. As I said above, there are many reasons that this is a bad idea. I am trying to find out if this is another reason why.
