Is it bad to have a non-clustered index that contains the primary key from the clustered index?
If you have a table with a clustered index on the primary key (int), is it redundant and bad to have one (or more) non-clustered indexes that include that primary key column as one of their columns?
Comments (4)
Actually, there can be valid reasons to create a non-clustered index identical to the clustered one. The clustered index carries the baggage of the full row data, and that can make for very poor row density. For example, you might fit only 2-3 rows per page because of wide fields that are not part of the clustered key, while the clustered index key itself is only, say, 20 bytes. A non-clustered index on exactly the same key(s), in the same order as the clustered index, would give a density of 200-300 keys per page. Many aggregate queries typical of an OLAP/BI workload can be answered more efficiently from the non-clustered index, simply because it reduces the I/O by a factor of a hundred or so.
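To put rough numbers on the density argument, here is a minimal back-of-the-envelope sketch in Python. The 8 KB page with a 96-byte header matches SQL Server's page layout, but the row count, the 3,000-byte row width, and the 30-byte index entry (a 20-byte key plus some per-entry overhead) are assumptions chosen only for illustration, and the per-row slot array is ignored.

```python
import math

PAGE_BYTES = 8192         # SQL Server data/index page size
PAGE_HEADER_BYTES = 96    # fixed page header
USABLE_BYTES = PAGE_BYTES - PAGE_HEADER_BYTES

def pages_for_scan(row_count, bytes_per_entry):
    """Leaf pages touched by a full scan, ignoring slot arrays and fill factor."""
    entries_per_page = USABLE_BYTES // bytes_per_entry
    return math.ceil(row_count / entries_per_page)

# Hypothetical table: all three numbers below are assumptions.
ROWS = 1_000_000
WIDE_ROW_BYTES = 3000     # full row in the clustered index: 2 rows per page
KEY_ENTRY_BYTES = 30      # 20-byte key plus assumed per-entry overhead

clustered_pages = pages_for_scan(ROWS, WIDE_ROW_BYTES)
nonclustered_pages = pages_for_scan(ROWS, KEY_ENTRY_BYTES)

print("clustered leaf pages:    ", clustered_pages)
print("non-clustered leaf pages:", nonclustered_pages)
print("approximate I/O ratio:   ", round(clustered_pages / nonclustered_pages))
```

Under these assumptions a query such as a `COUNT(*)` over the key would read about 500,000 pages from the clustered index versus fewer than 4,000 from the narrow index. Real numbers depend on the actual row layout, fill factor, and compression.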
As for non-clustered indexes that contain only part of the clustered key, or the same keys in a different order, all bets are off, since they could obviously serve a multitude of queries.
So the answer to your question is: It Depends.
For a more precise answer you'll have to share the exact schema of your table(s) and the exact queries involved.
Yes, it is typically not necessary, because the columns of the clustered index are already added to each index entry in the non-clustered index.
Why? The value of the clustered key is what really allows SQL Server to "find" a row of data - it's the "pointer" to the actual data - so obviously, it has to be stored in the non-clustered index. If you have looked up "Smith, John" and you need to know more about this person, you need to go to the actual data, and that is done by including the value of the clustering key in the index node of the non-clustered index.
That clustered key value is already there, and thus typically it's redundant and unnecessary to add that value again, explicitly, to your non-clustered index. It's bad in that it just simply wastes space without giving you any benefit.
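The "pointer" idea can be illustrated with a toy in-memory model in Python. This is only a sketch of the concept, not how SQL Server stores pages: the `clustered` dict, the `nonclustered` list, and the sample people are all made up for illustration.

```python
# Toy "clustered index": clustering key (the int primary key) -> full row.
clustered = {
    1: {"id": 1, "last_name": "Smith", "first_name": "John", "city": "Boston"},
    2: {"id": 2, "last_name": "Jones", "first_name": "Mary", "city": "Denver"},
    3: {"id": 3, "last_name": "Smith", "first_name": "Anna", "city": "Austin"},
}

# Toy "non-clustered index" on (last_name, first_name). Each entry carries
# the clustering key as its row locator, even though only the two name
# columns were asked for when the index was defined.
nonclustered = sorted(
    (row["last_name"], row["first_name"], pk) for pk, row in clustered.items()
)

def find_person(last_name, first_name):
    """Seek in the non-clustered index, then follow the clustering key to the row."""
    matches = []
    for ln, fn, pk in nonclustered:
        if (ln, fn) == (last_name, first_name):
            matches.append(clustered[pk])  # lookup by clustering key
    return matches

for person in find_person("Smith", "John"):
    print(person["id"], person["city"])
```

Every entry in `nonclustered` already ends with the primary key, which is the point the answer makes: the key is there whether or not it is listed explicitly.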
I'm with Remus on this - a clustered index is not really an index - it tells you how the data is organized in pages. (In your case, it's also the primary key, but that's not required to be the same thing). Non-clustered indexes include that row locator information, so yes, it is redundant.
But if a non-clustered index is covering and the data row bookmark doesn't need to be used, it can be used a lot more efficiently than the clustered index, and the efficiency increases as the ratio of the size of the data row to the size of the non-clustered index increases.
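A small Python sketch of the covering case, counting how many times a query has to go back to the data row. The table contents, column names, and both index shapes are hypothetical; the model only counts row lookups and says nothing about real page I/O.

```python
# Hypothetical table: clustering key -> row, with a wide "notes" column.
rows = {
    pk: {"id": pk, "customer": f"c{pk % 3}", "amount": pk * 10, "notes": "x" * 500}
    for pk in range(1, 7)
}

# Index on customer only: entries are (customer, clustering key).
idx_customer = sorted((r["customer"], pk) for pk, r in rows.items())

# Covering index for "sum of amount per customer": the amount is stored
# in the index entry, so the data row is never needed for this query.
idx_customer_amount = sorted(
    (r["customer"], pk, r["amount"]) for pk, r in rows.items()
)

def total_with_lookups(customer):
    """Non-covering plan: one data-row lookup per matching index entry."""
    total, lookups = 0, 0
    for cust, pk in idx_customer:
        if cust == customer:
            total += rows[pk]["amount"]  # bookmark lookup into the data row
            lookups += 1
    return total, lookups

def total_covering(customer):
    """Covering plan: answered from index entries alone, zero row lookups."""
    total = sum(amt for cust, _pk, amt in idx_customer_amount if cust == customer)
    return total, 0

print(total_with_lookups("c1"))
print(total_covering("c1"))
```

Both functions return the same total, but only the first touches the wide rows. The wider the data row is relative to the index entry, the more that difference matters.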
I've found that if you have a good handle on the access paths in your query workload, a few selective covering non-clustered indexes can often eliminate the clustering choice completely - a heap table, a PK, and some good non-clustered indexes, and you're done.
There's no 100% answer, but the answer is almost definitely yes.
The other indexes are (generally) there to help with joins and sorting. Given that the primary key is already indexed, if the optimizer can join based on that, it will use it.
If another index is needed from a join/sort perspective, what additional help does having the PK in the index mix provide? If it couldn't join based on the PK before, it's not going to now. And it's not really going to help any with sorting either.