SQL Server中如何选择聚集索引?
通常聚集索引是通过设置主键在 SQL Server Management Studio 中创建的,但是我最近关于 PK <-> 的问题是:聚集索引(Microsoft SQL Server 2008 主键的含义)表明没有必要将PK和聚集索引设置为相等。
那么我们应该如何选择聚集索引呢?让我们看以下示例:
create table Customers (ID int, ...)
create table Orders (ID int, CustomerID int)
我们通常会在两个 ID 列上创建 PK/CI,但我考虑为 CustomerID 中的订单创建它。这是最好的选择吗?
Usually the clustered index is created in SQL Server Management Studio by setting the primary key, however my recent question about PK <-> clustered index (Meaning of Primary Key to Microsoft SQL Server 2008) has shown that it is not necessary to set PK and clustered index to be equal.
So how should we choose clustered indexes then? Let's have the following example:
create table Customers (ID int, ...)
create table Orders (ID int, CustomerID int)
We would usually create the PK/CI on both ID columns but I thought about creating it for Orders in CustomerID. Is that the best choice?
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(3)
根据索引女王 - Kimberly Tripp 的说法 - 她在聚集索引中寻找的主要是:
如果您还可以保证:
,那么您就非常接近拥有您理想的聚类键!
查看她的整个 博客文章在这里,还有另一篇关于集群对表操作的关键影响的非常有趣的文章:聚集索引争论仍在继续。
任何像 INT(尤其是 INT IDENTITY)或可能的 INT 和 DATETIME 之类的东西都是理想的候选者。由于其他原因,GUID 根本不是好的候选者 - 因此您可能有一个 GUID 作为您的 PK,但不要将您的表聚集在它上面 - 它会变得支离破碎,无法识别,并且性能会受到影响。
According to The Queen Of Indexing - Kimberly Tripp - what she looks for in a clustered index is primarily:
And if you can also guarantee:
then you're pretty close to having your ideal clustering key!
Check out her entire blog post here, and another really interesting one about clustering key impacts on table operations here: The Clustered Index Debate Continues.
Anything like an INT (esp. an INT IDENTITY) or possibly an INT and a DATETIME are ideal candiates. For other reasons, GUID's aren't good candidates at all - so you might have a GUID as your PK, but don't cluster your table on it - it'll be fragmented beyond recognition and performance will suffer.
CLUSTERED
索引的最佳候选者是您最常用来引用记录的键。通常,这是一个
主键
,因为它是在搜索和/或外键
关系中使用的。在您的情况下,
Orders.ID
最有可能参与搜索和引用,因此它是作为聚类表达式的最佳候选。如果您在
Orders.CustomerID
上创建CLUSTERED
索引,则会发生以下情况:CustomerID
不唯一。为了确保唯一性,将向每条记录添加一个特殊的隐藏32 位
列,称为uniquifier
。表中的记录将根据这对列
(CustomerID, uniquifier)
存储。将创建
Order.ID
上的二级索引,以(CustomerID, uniquifier)
作为记录指针。像这样的查询:
<前><代码>选择*
来自订单
其中 ID = 1234567
必须执行外部操作,即
Clustered Seek
,因为并非所有列都存储在ID
上的索引中。要检索所有列,记录应首先位于聚集表中。此附加操作需要
IndexDepth
,因为读取的页面数量与简单的Clustered Seek
一样,IndexDepth
为O(log(n))< /code> 表中的记录总数。
A best candidate for a
CLUSTERED
index is the key you use to refer to your records most often.Usually, this is a
PRIMARY KEY
, since it's what used in searches and/orFOREIGN KEY
relationships.In your case,
Orders.ID
will most probably participate in the searches and references, so it is the best candidate for being a clustering expression.If you create the
CLUSTERED
index onOrders.CustomerID
, the following things will happen:CustomerID
is not unique. To ensure uniqueness, a special hidden32-bit
column known asuniquifier
will be added to each record.Records in the table will be stored according to this pair of columns
(CustomerID, uniquifier)
.A secondary index on
Order.ID
will be created, with(CustomerID, uniquifier)
as the record pointers.Queries like this:
will have to do an external operation, a
Clustered Seek
, since not all columns are stored in the index onID
. To retrieve all columns, the record should first be located in the clustered table.This additional operation requires
IndexDepth
as many page reads as a simpleClustered Seek
, theIndexDepth
beignO(log(n))
of total number of the records in your table.如果您担心集群,它通常是为了帮助改进数据检索。在您的示例中,您可能需要立即获取给定客户的所有记录。 customerID 上的集群会将这些行保留在同一物理页上,而不是分散在文件中的多个页面中。
ROT:对您想要显示的集合进行聚类。采购订单中的行项目就是一个典型的例子。
If you're concerned about clustering it's usually to help improve data retrieval. In you example, you're probably going to want all records for a given customer at once. Clustering on customerID will keep those rows on the same physical page rather than scattered throughout multiple pages in your file.
ROT: Cluster on what you want to show a collection of. Line items in a purchase order is the classic example.