SQL Server - 字典的聚集索引设计
希望从中得到一些建议。我有一个表,我想在其中跟踪一个对象以及与该对象相关的键列表。示例:
OBJECTID ITEMTYPE ITEMKEY
-------- -------- -------
1 1 THE
1 1 BROWN
1 2 APPLE
1 3 ORANGE
2 2 WINDOW
OBJECTID 和ITEMKEY 都具有很高的选择性(即OBJECTID 和ITEMKEY 变化很大)。我的访问有两种方式:
通过 OBJECTID:每次对象更改时,键列表都会更改,因此需要基于 OBJECTID 的键。变化经常发生。
按ITEMKEY:用于关键字搜索,也经常发生。
所以我可能需要两个键,并选择一个作为聚集索引(访问更频繁的那个,或者我想要的速度,现在让我们假设我将优先考虑聚集的 OBJECTID)。我很困惑的是我应该如何设计它。
我的问题是,哪个更好:
a)(OBJECTID,ITEMTYPE,ITEMKEY)的聚集索引,然后是(ITEMKEY)的索引。我担心的是,由于聚集索引太大(2 个整数,1 个字符串),索引也会很大,因为所有索引项都必须指向聚集键。
b) 创建一个新列,以运行标识 DIRECTORYID(整数)作为主键和聚集索引,并声明两个索引为 (OBJECTID,ITEMTYPE,ITEMKEY) 和 (ITEMKEY)。这将最大限度地减少索引空间,但查找成本更高。
c) (OBJECTID,ITEMTYPE,ITEMKEY) 的聚集索引及其上的 (ITEMKEY,ITEMTYPE,OBJECTID) 的物化视图。我的逻辑是,这避免了键查找,并且仍然与 a) 中查找的索引一样大,但代价是更高的开销。
d) 呃...考虑到要求,也许有更好的方法?
提前致谢, 安德鲁
Would like some advice from this. I got a table where I want to keep track of an object and a list of keys related to the object. Example:
OBJECTID ITEMTYPE ITEMKEY
-------- -------- -------
1 1 THE
1 1 BROWN
1 2 APPLE
1 3 ORANGE
2 2 WINDOW
Both OBJECTID and ITEMKEY have high selectivity (i.e. the OBJECTID and ITEMKEY are very varied). My access are two ways:
By OBJECTID: Each time an object changes, the list of key changes so a key is needed based on OBJECTID. Changes happen frequently.
By ITEMKEY: This is for keyword searching and also happens frequently.
So I probably need two keys, and choose one for clustered index (the one that is more frequently accessed, or where I want the speed to be, for now lets assume i will prioritize OBJECTID for clustered). What I am confused about is how I should design it.
My questions is, which is better:
a) A Clustered index of (OBJECTID,ITEMTYPE,ITEMKEY), and then an index of (ITEMKEY). My concern is that since a clustered index is so big (2 ints, 1 string) the index will be big, because all index items got to point back to the clustered key.
b) Create a new column with a running identity DIRECTORYID (integer) as primary key and clustered index, and declare two index for (OBJECTID,ITEMTYPE,ITEMKEY) and just (ITEMKEY). This will minimize index space but have higher lookup costs.
c) A Clustered index of (OBJECTID,ITEMTYPE,ITEMKEY), and a materialized view of (ITEMKEY,ITEMTYPE,OBJECTID) on it. My logic is that this is avoids a key lookup and will still be just as big as the index with a lookup in a), at cost of higher overhead.
d) Err...maybe there is a better way given the requirements?
Thanks in advance,
Andrew
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(1)
如果可能,请尝试使聚集键尽可能小,因为它也会添加到表上的所有非聚集索引中。
因此,如果可能的话,我会使用 INT,或者可能使用两个 INT 的组合 - 但绝对不会使用
VARCHAR
列 - 特别是如果该列可能很宽(> 10 个字符)并且绑定到改变。所以在你提出的选项中,我个人会选择 b) - 为什么?
添加代理
DirectoryID
将满足集群键的所有关键标准:,并且您的其他非集群索引将受到最小的影响。
请参阅 Kimberly Tripp 的杰出的博客文章介绍了在 SQL Server 表上选择良好集群键的主要标准 - 非常有用且具有启发性!
为了满足您的查询要求,我将添加两个非聚集索引,一个位于
ObjectID
上(可能包括其他经常需要的列),另一个位于ItemKey
上以按键名搜索。If ever possible, try to keep your clustered key as small as possible, since it will be also added to all non-clustered indices on your table.
Therefore, I would use an INT if ever possible, or possibly a combination of two INT - but certainly never a
VARCHAR
column - especially if that column is potentially wide (> 10 chars) and is bound to change.So of the options you present, I personally would choose b) - why??
Adding a surrogate
DirectoryID
will satisfy all crucial criteria for a clustering key:and your other non-clustered indices will be minimally impacted.
See Kimberly Tripp's outstanding blog post on the main criteria for choosing a good clustering key on your SQL Server tables - very useful and enlightening!
To satisfy your query requirements, I would add two non-clustered indices, one on
ObjectID
(possibly including other columns frequently needed), and another onItemKey
to search by keyname.