Is it good to have the primary key as an identity field?
I have read a lot of articles about whether we should have primary keys that are identity columns, but I'm still confused.
Making a column an identity has advantages: it gives better join performance and provides data consistency. But identity has a major drawback: when an INSERT statement fails, the IDENTITY value still increases. If a transaction is rolled back, the new IDENTITY value isn't rolled back, so we end up with gaps in the sequence. I could use GUIDs (generated with NEWSEQUENTIALID), but that reduces performance.
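The gap behaviour described in the question can be illustrated with a small toy model. This is only a sketch, not SQL Server itself: the `IdentityTable` class and its `fail` flag are hypothetical names invented for the illustration, and the model simply assumes that the counter advances outside the transaction.

```python
class IdentityTable:
    """Toy model: the id counter advances outside the transaction,
    as the question describes for IDENTITY columns."""

    def __init__(self):
        self.next_id = 1
        self.rows = {}

    def insert(self, value, fail=False):
        new_id = self.next_id
        self.next_id += 1  # the value is consumed immediately and never handed back
        if fail:
            # Simulated failed INSERT / rollback: the row is discarded,
            # but the consumed id is not reused.
            return None
        self.rows[new_id] = value
        return new_id


table = IdentityTable()
table.insert("a")
table.insert("b", fail=True)  # this attempt burns id 2
table.insert("c")

ids = sorted(table.rows)
print(ids)  # [1, 3]
```

The surviving rows are still uniquely identified; the only visible effect of the failed insert is the missing value 2.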
Comments (4)
Gaps should not matter: the identity column is internal and not for end user usage or recognition.
GUIDs will kill performance, even sequential ones, because of their 16-byte width.
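To put rough numbers on the width argument, here is a back-of-the-envelope sketch. It assumes an 8 KB page (the SQL Server page size) and ignores page headers and per-row overhead, so the counts are upper bounds for comparison only, not real index densities.

```python
import struct
import uuid

PAGE_BYTES = 8192  # assumed page size; headers and row overhead are ignored

int_key = struct.pack("<i", 123456)     # 4-byte signed integer, comparable to INT
bigint_key = struct.pack("<q", 123456)  # 8-byte signed integer, comparable to BIGINT
guid_key = uuid.uuid4().bytes           # 16 raw bytes, comparable to UNIQUEIDENTIFIER

keys_per_page = {
    "int": PAGE_BYTES // len(int_key),
    "bigint": PAGE_BYTES // len(bigint_key),
    "guid": PAGE_BYTES // len(guid_key),
}
print(keys_per_page)  # {'int': 2048, 'bigint': 1024, 'guid': 512}
```

Under these assumptions a page holds a quarter as many GUID keys as INT keys, and the same key width is repeated in every index and foreign key that references the table.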
An identity column should be chosen to respect the physical implementation after modelling your data and working out what your natural keys are. That is, the chosen natural key is the logical key but you choose a surrogate key (identity) because you know how the engine works.
Or you use an ORM and let the client tail wag the database dog...
For all practical purposes, integers are ideal for primary keys, and auto-increment is a perfect way to generate them. As long as your PK is meaningless (a surrogate), it will be protected from the creativity of your customers and will serve its main purpose (identifying a row in a table) just fine. Indexes are compact, joins are as fast as they get, and it is easy to partition tables.
If you happen to need GUID, that's fine too; however, think auto-increment integer first.
I would say it depends on your needs. We use only GUIDs as primary keys (with the default set to NEWID()) because we develop a distributed system with many SQL Server instances, so we have to be sure that every SQL Server instance generates unique primary key values.
But when using a Guid column as PK, be sure not to use it as your clustered index (thanks to marc_s for the link)
Advantage of the Guid type:
Disadvantage:
Data consistency is not an issue with primary keys, regardless of the data type, because a primary key has to be unique by definition!
I don't believe that an identity column has better join performance. Ultimately, performance is a matter of having the right indexes. A primary key is a constraint, not an index.
Do you need a primary key of type int with no gaps? Normally this shouldn't be a problem.
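The distributed scenario in this answer can be sketched as follows. The two "nodes" are just independent function calls standing in for separate SQL Server instances, and `counter_keys` and `guid_keys` are hypothetical helper names; random UUIDs from Python's `uuid.uuid4()` stand in for values a server would generate with NEWID().

```python
import uuid


def counter_keys(n):
    """Keys from an independent auto-increment counter starting at 1."""
    return list(range(1, n + 1))


def guid_keys(n):
    """Keys generated as random 128-bit GUIDs."""
    return [uuid.uuid4() for _ in range(n)]


# Two nodes generate keys without talking to each other.
node_a_ints, node_b_ints = counter_keys(1000), counter_keys(1000)
node_a_guids, node_b_guids = guid_keys(1000), guid_keys(1000)

int_collisions = len(set(node_a_ints) & set(node_b_ints))
guid_collisions = len(set(node_a_guids) & set(node_b_guids))
print(int_collisions, guid_collisions)
```

Independent counters collide on every value, while a collision between random GUIDs is so unlikely that it can be ignored in practice. Integer keys can still work across instances if each one is given its own range or seed and increment, at the cost of coordinating those settings up front.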
"yes, it KILLS performance - totally. I went from a legacy system with GUID as PK/CK and 99.5% index fragmentation on a daily basis to using INT IDENTITY - HUGE difference. Hardly any index fragmentation anymore, performance is significantly better. GUIDs as Clustering Index on your SQL Server table are BAD BAD BAD - period."
Might be true, but I see no logical reasoning according to which this leads me to conclude that GUIDs PER SE are also BAD BAD BAD.
Maybe you should consider using other types of indexes on such data. And if your dbms does not offer you a choice between several types of index, then perhaps you should consider getting yourself a better dbms.
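The fragmentation effect in the quoted comment comes from key order rather than from GUIDs as such, which a toy model of sorted, fixed-capacity leaf pages can show. This is a simplified sketch, not SQL Server's storage engine: `PAGE_CAPACITY`, `simulate_inserts`, and the rule that an insert past the current end starts a fresh page instead of splitting are all assumptions of the model.

```python
import bisect
import random

PAGE_CAPACITY = 100  # hypothetical number of keys that fit on one leaf page


def simulate_inserts(keys):
    """Insert keys into sorted fixed-size pages.

    Returns (page_splits, page_count, average_fill_ratio).
    """
    pages = [[]]  # pages are kept in key order
    splits = 0
    for key in keys:
        # Locate the page whose key range covers this key.
        firsts = [page[0] for page in pages[1:]]
        idx = bisect.bisect_right(firsts, key)
        page = pages[idx]
        bisect.insort(page, key)
        if len(page) > PAGE_CAPACITY:
            if idx == len(pages) - 1 and page[-1] == key:
                # New largest key: start a fresh page, leaving the old one full.
                pages.append([page.pop()])
            else:
                # Insert in the middle of the key range: split the page in half.
                mid = len(page) // 2
                pages[idx:idx + 1] = [page[:mid], page[mid:]]
                splits += 1
    fill = sum(len(p) for p in pages) / (len(pages) * PAGE_CAPACITY)
    return splits, len(pages), fill


rng = random.Random(0)
sequential = list(range(10_000))                      # like an identity column
scattered = [rng.getrandbits(64) for _ in range(10_000)]  # like random GUIDs

seq_splits, seq_pages, seq_fill = simulate_inserts(sequential)
rnd_splits, rnd_pages, rnd_fill = simulate_inserts(scattered)
print("sequential:", seq_splits, seq_pages, round(seq_fill, 2))
print("random:    ", rnd_splits, rnd_pages, round(rnd_fill, 2))
```

In this model ascending keys never split a page and leave every page full, while randomly ordered keys keep splitting pages and leave them partly empty. That matches both sides of the exchange above: the cost is tied to clustering on a randomly ordered key, so a GUID primary key with a different clustering key, or sequentially generated GUIDs, would not pay it in the same way.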