单列主键或多列主键的 SQL Server 性能差异?
如果主键是单列(例如,为每行生成一个 GUID)或多列(例如,外键 GUID + 偏移量),表的性能(在插入/更新和查询方面)是否有任何差异数字)?
我认为如果具有多列主键,查询速度应该更快,但是我认为由于稍微复杂的唯一检查,插入会更慢?我还想象多列主键的数据类型也可能很重要(例如,如果其中一列是 DateTime 类型,则会增加复杂性)。这些只是我的想法来调用答案和讨论(希望如此!)并且不是基于事实的。
Is there any difference in performance (in terms of inserting/updating & querying) a table if the primary key is a single column (e.g., a GUID generated for every row) or multiple columns (e.g., a foreign key GUID + an offset number)?
I would assume querying speeds should be quicker if anything with multi-column primary keys, however I would imagine inserting would be slower due to a slightly more complicated unique check? I also imagine the data types of a multi-column primary key could also matter (e.g., if one of the columns was a DateTime type it would add complexity). These are just my thoughts to invoke answers & discussion (hopefully!) and are not fact based.
I realise there are some other questions covering this topic, but I'm wondering about performance impacts rather than management/business concerns.
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(4)
与密钥中的组件数量相比,密钥的(每个)组件(a)可变长度和(b)宽度[宽而不是窄列]对您的影响更大。除非 MS 在最新版本中再次破坏了它(他们在 2005 年破坏了堆)。数据类型不会减慢速度;宽度,特别是可变长度(任何数据类型)。请注意,如果固定 len 列设置为 Nullable,则该列将变为可变。索引中的可变 len 列是个坏消息,因为每次访问都必须执行一些“解包”才能获取数据。
显然,保持索引列尽可能窄,使用固定列,而不是仅使用可空列。
就复合键中的列数而言,肯定一列比七列快,但也不是那么快:三个宽可变列比七个固定固定列慢得多。
GUID 当然是一个非常厚的密钥; GUID加上其他东西就非常非常丰富了; GUID Nullable 是吉尼斯材料。不幸的是,这是解决身份问题的下意识反应,而这又是没有选择良好的自然关系键的结果。所以建议您最好从源头上解决真正的问题,并选择好的自然键;避免身份;避免 GUID。
经验和性能调优,而不是猜测。
You will be affected more by (each) component of the key being (a) variable length and (b) the width [wide instead of narrow columns], than the number of components in the key. Unless MS have broken it again in the latest release (they broke Heaps in 2005). Datatype does not slow it down; the width, and particularly variable length (any datatype) does. Note that a fixed len column is made variable if it is set to Nullable. Variable len columns in indices is bad news, because a bit of "unpacking" has to be performed on every access, to get at the data.
Obviously, keep indexed columns as narrow as possible, using fixed, and not Nullable columns only.
In terms of number of columns in a compound key, sure one column is faster than seven, but not that much: three fat wide variable columns are much slower than seven thin fixed columns.
GUID is of course a very fat key; GUID plus anything else is very very fat; GUID Nullable is Guiness material. Unfortunately it is the knee-jerk reaction to solving the IDENTITY problem, which in turn is a consequence of not having chosen good natural relational keys. So you are best advised to fix the real problem at the source, and choose good natural keys; avoid IDENTITY; avoid GUID.
Experience and performance tuning, not conjecture.
这取决于您的访问模式、读/写比率以及是否(可能是最重要的)在主键上定义聚集索引。
经验法则是使主键尽可能小(32 位整数),并尽可能在单调递增的键上定义聚集索引(想想 IDENTITY),除非您的范围搜索构成针对该表的查询的很大一部分。
如果您的应用程序是写入密集型的,并且您在 GUID 列上定义聚集索引,则应注意:
所有非聚集索引都将
包含聚集索引键,因此会更大。如果有很多 NC 索引,这可能会对性能产生负面影响。
除非您使用“有序”
GUID(例如 COMB 或使用
NEWSEQUENTIALID()),您的插入
随着时间的推移,索引会变得碎片化。这意味着
您需要定期重建索引并且
可能会增加数量
页面中剩余的可用空间(填充
因素)
因为有很多因素在起作用(硬件、访问模式、数据大小),我建议您运行一些测试并对您的特定情况进行基准测试
It depends on your access patterns, read/write ratio and whether (possibly most importantly) the clustered index is defined on the Primary Key.
Rule of thumb is make your primary key as small as possible (32 bit int) and define the clustered index on a monotonically increasing key (think IDENTITY) where possible, unless you have range searches that form a large proportion of the queries against that table.
If your application is write intensive, and you define the clustered index on the GUID column you should note:
All non-clustered indexes will
contain the clustered index key and will therefore be larger. This may have a negative effect of performance if there are many NC indexes.
Unless you are using an 'ordered'
GUID (such as a COMB or using
NEWSEQUENTIALID()), your inserts
will fragment the index over time. This means
you need a regular index rebuild and
possibly increasing the amount of
free space left in pages (fill
factor)
Because there are many factors at work (hardware, access patterns, data size), I suggest you run some tests and benchmark your particular circumstances..
这取决于每种情况的索引和存储。在所有其他条件相同的情况下,主键的选择与性能无关。索引和其他存储选项的选择将是决定因素。
It depends on the indexing and storage in each case. All other things being equal, the choice of primary key is irrelevant as far as performance is concerned. The choice of indexes and other storage options would be the deciding factor.
如果您的情况需要更多数量的刀片,那么占地面积越小越好。
您需要区分两件事:数据库级别的主键概念和应用程序使用的键的概念。
为什么需要 GUID?您是否要插入多个数据库服务器,然后将信息合并到一个集中式数据库中?
如果是这种情况,那么我的建议是一个身份,然后是一个指导。标识上的聚集索引,GUID 上的唯一非聚集索引。如果您使用 GUID 作为聚集索引,那么您的数据插入将会到处都是。这意味着您的数据不会按顺序插入,这会导致性能问题,因为您的系统将随机插入和移动页面。
由于身份,将您的数据很好地插入有序派系中是可行的方法。您可以将排序留给索引结构(包含 GUID 的非聚集唯一),这是一种比使用表数据更有效的排序结构。
If your situation is going to be geared towards a higher number of inserts, then the smaller footprint possible, the better.
There are two things you need to separate, the concept of the primary key at the database level, and the concept of the key your application uses.
Why do you need a GUID? Are you going to be inserting into multiple database server, and then combining the information into one centralized database?
If that is the case then my recommendation is an identity followed by a guid. Clustered index on the identity, and Unique Non clustered on the GUID. If you use the GUID as a Clustered index, then your data inserts will be all over the place. Meaning your data will not be inserted sequentially, and this causes performance problems as your system will be inserting and moving pages around randomly.
Having your data inserted nice in an ordered faction, thanks to the identity, is the way to go. You can leave the sorting to the index structure( the nonclusered unique containing the GUID), which is a much more efficient structure to sort than using the table data.