如何选择我的主键?
我找到了有关选择主键的阅读材料。
- 是否有关于如何为给定表选择主键的指南/博客文章?
- 我应该使用自动递增/生成的键,还是应该将主键基于正在建模的数据(假设它具有真正唯一的字段)?
- 出于性能考虑,主键是否应该始终很长,或者我可以将外部唯一 id 作为主键,即使它是一个字符串?
I found this reading material on choosing a primary key.
- Is there a guide / blog post on how to choose the primary key for a given table?
- Should I use a auto-incremented/generated key, or should I base the primary key on the data being modeled (assuming it has a truly unique field)?
- Should the primary key always be long for performance's sake, or can I take an external unique id as primary key, even if it's a string?
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(6)
我相信在实践中使用 自然键 很少比 代理键。
以下是使用自然键作为主键的主要缺点:
您可能有不正确的键值,或者您可能只是想重命名键值。要编辑它,您必须更新所有将其用作外键的表。
通常很难拥有真正的唯一自然密钥。
自然键通常是字符串。数字字段上的索引比字符串字段上的索引要紧凑得多。
对于主键的数据类型应该是什么,没有硬性规定。数字键通常性能更好,但您可以使用字符串,特别是当表不大并且引用它的表也不大时。
I believe that in practice using a natural key is rarely better than a surrogate key.
The following are the main disadvantages of using a natural key as the primary key:
You might have an incorrect key value, or you may simply want to rename a key value. To edit it, you would have to update all the tables that would be using it as a foreign key.
It is often difficult to have a truly unique natural key.
Natural keys are often strings. An index on an numeric field will be much more compact than one on a string field.
There is no hard rule on what the data type of the primary key should be. A numeric key normally performs better, but you could use a string, especially if the table is not big, and the tables that reference it are not big either.
密钥是一组具有两个基本特征的属性:唯一性和最小性。极简性意味着密钥仅具有确保唯一性所需的最少属性。
通常采用三个标准作为选择好密钥的指南:
这些都是好的指导方针,但不是绝对要求。在所有情况下,功能需求和数据完整性需求应决定使用哪些密钥。
A key is a set of attributes with two fundamental features: uniqueness and minimality. Minimality means the key has only the minimum number of attributes required to ensure uniqueness.
There are three criteria commonly applied as a guide to choosing a good key:
These are good guidelines but are not absolute requirements. In all cases functional requirements and the needs of data integrity should determine what keys to use.
我使用代理键,通常称为无意义键,由自动生成的 int/bigint 数据类型组成。
以下是我喜欢使用这些键的一些原因。
I use surrogate keys, often referred to as non-sensical keys, made up of an autogenerated int/bigint datatype.
Here are some of the reasons I like using these keys.
以下是一些有关主键的博客文章:
http://www.mysqlperformanceblog.com/2006/10/03/long-primary-key-for-innodb-tables/
http://www.mysqlperformanceblog.com/2007/03/13/to-uuid-or-not-to-uuid /
Here are a couple of blog posts about primary keys:
http://www.mysqlperformanceblog.com/2006/10/03/long-primary-key-for-innodb-tables/
http://www.mysqlperformanceblog.com/2007/03/13/to-uuid-or-not-to-uuid/
我在专业系统(主要是银行软件)中使用过许多不同的数据模型,并且有不同的解决方案。我见过 GUID 解决方案,它似乎对性能没有太大影响。我已经看到“服务提供的号码作为系统范围的唯一号码”。我见过提供类似 GUID 之类“但更短”的算法。我还看到使用了业务密钥(如帐号),这是设计不佳并导致问题的,我不会推荐它。我已经看到每个表的自动递增键。
我最喜欢什么?服务提供的号码作为系统范围的号码。效果很好。通过简单的密钥转换表,我们可以使用用户密钥(如帐号)来找出唯一编号和数据对象的类型(不一定是表,因为如果数据对象相同,则相同的唯一密钥可能适用于多个表)根据其类型分为不同的表)。
那么有博客之类的吗?我有一本值得推荐的书,名为《Data Modeling Essentials》,作者是 Graeme Simsion 和 Graham Witt。他们可能不会建议我首选的解决方案,但他们给出了许多真实的例子,并展示了可能的不同类型的解决方案。
I have worked with a lot of different data models in professional systems (mostly bank software) and there were different solutions. There was the GUID solution I have seen and it seemed not to have impacted the performances too much. I have seen the "number provided by a service as a system wide unique number". I have seen algorithms of providing something like a GUID "but shorter". I have seen also that the business key was used (like the account number) which is poor design and caused problems and I would not recommend it. I have seen the auto-incremented key for each table.
What did I like the most? The number provided by a service as a system wide number. It works well. And with a simple key translation table one can use a user key (like an account number) to find out what unique number and what sort of data object (not necessarily the table because the same unique key may apply to several tables if a data object is split up on different tables depending on its type).
So is there a blog or something? Well I have a book to recommend called "Data Modeling Essentials" by Graeme Simsion and Graham Witt. They might not suggest my preferred solution but they give many real live examples and show the different kind of solutions that are possible.
我总是选择 uuid 作为主键。与 int/long key 相比,有一点开销,但有很多好处:您不会遇到类型溢出,您可以稍后对数据库进行分片而不更改主键,您可以与其他系统集成并确保你的主键总是唯一的,uuid 无法猜测等等。
I always choose uuid as a primary key. In comparison to int/long key, there is a slight overhead, but there are a lot of benefits: you cannot run into type overflow, you can shard database later on without changing primary keys, you can integrate with other systems and be sure that your primary keys are always unique, uuid cannot be guessed etc.