SQL Server: Clustering by timestamp; pros/cons

Posted 2024-08-29 22:09:10

I have a table in SQL Server where I want inserts to be added to the end of the table (as opposed to a clustering key that would cause them to be inserted in the middle). This means I want the table clustered by some column that will constantly increase.

This could be achieved by clustering on a datetime column:

CREATE TABLE Things (
    ...
    CreatedDate datetime DEFAULT getdate(),
    [timestamp] timestamp,        

    CONSTRAINT [IX_Things] UNIQUE CLUSTERED (CreatedDate)
)

But I can't guarantee that two Things won't have the same time, so my requirements can't really be met by a datetime column.

I could add a dummy identity int column, and cluster on that:

CREATE TABLE Things (
    ...
    RowID int IDENTITY(1,1),
    [timestamp] timestamp,        

    CONSTRAINT [IX_Things] UNIQUE CLUSTERED (RowID)
)

But you'll notice that my table already contains a timestamp column, a column which is guaranteed to be monotonically increasing. This is exactly the characteristic I want for a candidate cluster key.

So I cluster the table on the rowversion (aka timestamp) column:

CREATE TABLE Things (
    ...
    [timestamp] timestamp,        

    CONSTRAINT [IX_Things] UNIQUE CLUSTERED (timestamp)
)

Rather than adding a dummy identity int column (RowID) to ensure an order, I use what I already have.

What I'm looking for are thoughts on why this is a bad idea, and what other ideas are better.

Note: Community wiki, since the answers are subjective.

叫思念不要吵 2024-09-05 22:09:10

"So I cluster the table on the rowversion (aka timestamp) column: Rather than adding a dummy identity int column (RowID) to ensure an order, I use what I already have."

That might sound like a good idea at first, but it is really almost the worst option you have. Why?

The main requirements for a clustered key are (see Kim Tripp's blog post for more excellent details):

stable
narrow
unique
ever-increasing if possible

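Your rowversion violates the stable requirement, which is probably the most important one. The rowversion of a row changes with every modification to the row, and since the clustering key is added to every non-clustered index on the table, your server will be constantly updating loads of non-clustered indexes and wasting a lot of time doing so.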
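The instability described above can be observed directly. The following is a minimal T-SQL sketch, not part of the original answer; the table dbo.RowVersionDemo and its column names are hypothetical and exist only for this illustration. It assumes SQL Server 2008 or later (for DECLARE with an initializer).

```sql
-- Hypothetical demo table; all names here are illustrative assumptions.
CREATE TABLE dbo.RowVersionDemo (
    Id      int         NOT NULL PRIMARY KEY,
    Payload varchar(20) NOT NULL,
    RowVer  rowversion
);

INSERT INTO dbo.RowVersionDemo (Id, Payload) VALUES (1, 'first');

-- Capture the rowversion value assigned at insert time.
DECLARE @before binary(8) =
    (SELECT CAST(RowVer AS binary(8)) FROM dbo.RowVersionDemo WHERE Id = 1);

-- Any modification to the row assigns a new rowversion value.
UPDATE dbo.RowVersionDemo SET Payload = 'changed' WHERE Id = 1;

-- Compare the value before and after the update.
SELECT @before AS RowVerBefore,
       CAST(RowVer AS binary(8)) AS RowVerAfter,
       CASE WHEN CAST(RowVer AS binary(8)) > @before
            THEN 'changed' ELSE 'unchanged' END AS Result
FROM dbo.RowVersionDemo
WHERE Id = 1;

DROP TABLE dbo.RowVersionDemo;
```

If RowVer were the clustering key, each such UPDATE would change the key itself, which is the cost the answer is warning about.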

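In the end, adding a dummy identity column is probably a much better alternative for your case. The second-best choice would be the datetime column, but here you do run the risk of SQL Server having to add "uniqueifiers" to your entries when duplicates occur, and with a 3.33 ms accuracy this could definitely happen. Not optimal, but definitely much better than the rowversion idea.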
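To make the 3.33 ms point concrete, here is a small T-SQL sketch (an illustration added here, not taken from the answer). It relies on the documented behavior that datetime values are rounded to increments of .000, .003, or .007 seconds; the variable names and sample instants are arbitrary.

```sql
-- Two distinct instants, 2 ms apart, expressed as ISO 8601 strings.
DECLARE @a datetime = '2024-01-01T00:00:00.002';
DECLARE @b datetime = '2024-01-01T00:00:00.004';

-- Both values round to the same .003 tick, so they compare as equal.
SELECT @a AS A,
       @b AS B,
       CASE WHEN @a = @b THEN 'collide' ELSE 'distinct' END AS Result;
```

Two rows created within the same tick would therefore violate a UNIQUE CLUSTERED constraint on CreatedDate, or force a uniqueifier if the clustered index is declared non-unique.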

一瞬间的火花 2024-09-05 22:09:10

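From the link "timestamp" in the question:

"The timestamp syntax is deprecated. This feature will be removed in a future version of Microsoft SQL Server. Avoid using this feature in new development work, and plan to modify applications that currently use this feature."

and

"Duplicate rowversion values can be generated by using the SELECT INTO statement in which a rowversion column is in the SELECT list. We do not recommend using rowversion in this manner."

So why on earth would you want to cluster by either, especially since their values always change when the row is updated? Just use an identity as the PK and cluster on it.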
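A minimal sketch of that suggestion might look like the following. It is an assumption-laden illustration rather than the answerer's own code: the constraint names and the RowVer column name are invented here, the question's elided columns are simply left out, and the rowversion keyword is used in place of the deprecated timestamp syntax.

```sql
-- Sketch only: the question's other columns (shown there as "...") are omitted.
CREATE TABLE dbo.Things (
    RowID       int IDENTITY(1,1) NOT NULL,
    CreatedDate datetime NOT NULL
        CONSTRAINT DF_Things_CreatedDate DEFAULT GETDATE(),
    RowVer      rowversion NOT NULL,  -- still available for change detection

    -- Narrow, unique, stable, ever-increasing clustering key.
    CONSTRAINT PK_Things PRIMARY KEY CLUSTERED (RowID)
);
```

New rows land at the end of the clustered index because RowID only grows, while updates leave the key untouched even though RowVer changes.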

So要识趣 2024-09-05 22:09:10

You were on the right track already. You can use a datetime column that holds the created date and create a clustered but non-unique index on it.

CREATE TABLE Things (
    ...
    CreatedDate datetime DEFAULT getdate(),
    [timestamp] timestamp
)

CREATE CLUSTERED INDEX [IX_CreatedDate] ON [Things]
(
    [CreatedDate] ASC
)