SQL Server - 字典的聚集索引设计

发布于 2024-09-26 01:26:13 字数 1000 浏览 6 评论 0原文

希望从中得到一些建议。我有一个表,我想在其中跟踪一个对象以及与该对象相关的键列表。示例:

OBJECTID   ITEMTYPE   ITEMKEY
--------   --------   -------
1          1          THE
1          1          BROWN
1          2          APPLE
1          3          ORANGE
2          2          WINDOW

OBJECTID 和ITEMKEY 都具有很高的选择性(即OBJECTID 和ITEMKEY 变化很大)。我的访问有两种方式:

  • 通过 OBJECTID:每次对象更改时,键列表都会更改,因此需要基于 OBJECTID 的键。变化经常发生。

  • 按ITEMKEY:用于关键字搜索,也经常发生。

所以我可能需要两个键,并选择一个作为聚集索引(访问更频繁的那个,或者我想要的速度,现在让我们假设我将优先考虑聚集的 OBJECTID)。我很困惑的是我应该如何设计它。

我的问题是,哪个更好:

a)(OBJECTID,ITEMTYPE,ITEMKEY)的聚集索引,然后是(ITEMKEY)的索引。我担心的是,由于聚集索引太大(2 个整数,1 个字符串),索引也会很大,因为所有索引项都必须指向聚集键。

b) 创建一个新列,以运行标识 DIRECTORYID(整数)作为主键和聚集索引,并声明两个索引为 (OBJECTID,ITEMTYPE,ITEMKEY) 和 (ITEMKEY)。这将最大限度地减少索引空间,但查找成本更高。

c) (OBJECTID,ITEMTYPE,ITEMKEY) 的聚集索引及其上的 (ITEMKEY,ITEMTYPE,OBJECTID) 的物化视图。我的逻辑是,这避免了键查找,并且仍然与 a) 中查找的索引一样大,但代价是更高的开销。

d) 呃...考虑到要求,也许有更好的方法?

提前致谢, 安德鲁

Would like some advice from this. I got a table where I want to keep track of an object and a list of keys related to the object. Example:

OBJECTID   ITEMTYPE   ITEMKEY
--------   --------   -------
1          1          THE
1          1          BROWN
1          2          APPLE
1          3          ORANGE
2          2          WINDOW

Both OBJECTID and ITEMKEY have high selectivity (i.e. the OBJECTID and ITEMKEY are very varied). My access are two ways:

  • By OBJECTID: Each time an object changes, the list of key changes so a key is needed based on OBJECTID. Changes happen frequently.

  • By ITEMKEY: This is for keyword searching and also happens frequently.

So I probably need two keys, and choose one for clustered index (the one that is more frequently accessed, or where I want the speed to be, for now lets assume i will prioritize OBJECTID for clustered). What I am confused about is how I should design it.

My questions is, which is better:

a) A Clustered index of (OBJECTID,ITEMTYPE,ITEMKEY), and then an index of (ITEMKEY). My concern is that since a clustered index is so big (2 ints, 1 string) the index will be big, because all index items got to point back to the clustered key.

b) Create a new column with a running identity DIRECTORYID (integer) as primary key and clustered index, and declare two index for (OBJECTID,ITEMTYPE,ITEMKEY) and just (ITEMKEY). This will minimize index space but have higher lookup costs.

c) A Clustered index of (OBJECTID,ITEMTYPE,ITEMKEY), and a materialized view of (ITEMKEY,ITEMTYPE,OBJECTID) on it. My logic is that this is avoids a key lookup and will still be just as big as the index with a lookup in a), at cost of higher overhead.

d) Err...maybe there is a better way given the requirements?

Thanks in advance,
Andrew

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(1

甜妞爱困 2024-10-03 01:26:13

如果可能,请尝试使聚集键尽可能小,因为它也会添加到表上的所有非聚集索引中。

因此,如果可能的话,我会使用 INT,或者可能使用两个 INT 的组合 - 但绝对不会使用 VARCHAR 列 - 特别是如果该列可能很宽(> 10 个字符)并且绑定到改变。

所以在你提出的选项中,我个人会选择 b) - 为什么?

添加代理 DirectoryID 将满足集群键的所有关键标准:

  • 小、
  • 稳定、
  • 唯一、
  • 不断增加

,并且您的其他非集群索引将受到最小的影响。

请参阅 Kimberly Tripp 的杰出的博客文章介绍了在 SQL Server 表上选择良好集群键的主要标准 - 非常有用且具有启发性!

为了满足您的查询要求,我将添加两个非聚集索引,一个位于 ObjectID 上(可能包括其他经常需要的列),另一个位于 ItemKey 上以按键名搜索。

If ever possible, try to keep your clustered key as small as possible, since it will be also added to all non-clustered indices on your table.

Therefore, I would use an INT if ever possible, or possibly a combination of two INT - but certainly never a VARCHAR column - especially if that column is potentially wide (> 10 chars) and is bound to change.

So of the options you present, I personally would choose b) - why??

Adding a surrogate DirectoryID will satisfy all crucial criteria for a clustering key:

  • small
  • stable
  • unique
  • ever-increasing

and your other non-clustered indices will be minimally impacted.

See Kimberly Tripp's outstanding blog post on the main criteria for choosing a good clustering key on your SQL Server tables - very useful and enlightening!

To satisfy your query requirements, I would add two non-clustered indices, one on ObjectID (possibly including other columns frequently needed), and another on ItemKey to search by keyname.

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文