主键:代码还是名称?
最常见的数据库表类型之一具有字母数字代码和人类友好的名称(例如国家、货币、帐户、产品、增值税代码等)。
传统/显而易见的做法是将代码作为主键。对于某些表,例如客户,其数量可能很大,因此名称可能不是唯一标识符,这显然是正确的做法。
但是像国家和货币这样的数量保证很小且名称保证唯一的东西呢?在这种情况下,引用几乎总是以人类友好的名称输入和显示。
在这种情况下,有什么理由不将名称设为主键吗?
One of the most common kinds of database table has an alphanumeric code and human friendly name (e.g. countries, currencies, accounts, products, VAT codes etc.)
The traditional/obvious thing to do is make the code the primary key. And for some tables e.g. customers where the number may be large so that names may not be unique identifiers, this is clearly the correct thing to do.
But what about things like countries and currencies where the number is guaranteed to be small and the names are guaranteed to be unique? In that case, references will almost always be both input and displayed with the human friendly name.
In that scenario, is there any reason not to make the name the primary key?
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(3)
永远、永远不要使用商业价值作为关键。
原因如下:无论您多么确定客户号码是不可变的,将来某个时候客户号码都必须更改(中国客户的号码是 4444,无论如何,这是非常不幸的)。如果你使用 4444 作为密钥,你不仅需要更改客户的密钥,还需要更改他的订单中的相关记录、他的地址等。
(有些人会认为这可以通过级联更新来解决,但是存在触发器时存在风险。)
最佳实践:创建代理键并将其命名为 ID(有些人更喜欢 CustomerID)。代理是对用户隐藏的密钥(因此得名),其唯一目的是提供唯一的密钥。这使您可以进行明确的连接和删除,而不必担心用户可能会更改什么。
每个表都应该有一个代理主键,它可以是一个自动递增整数,也可以是一个 GUID(根据数据库提供者的不同而有所不同)。
此规则只有一个允许的例外:创建 NN 关系时(例如,一个客户可以拥有多个地址,并且客户可以共享地址)。在这种情况下,可以使用 [CustomerID, AddressID] 对作为主键。
哦,最后,固定长度的整数/guid 上的连接比可变长度字符串上的连接快得多。
Never, ever, use a business value as a key.
Here's why: No matter how sure you are that the customer number is immutable, sometime in the future a customer number will have to changed (Chinese customer gets 4444, which is very unlucky, whatever). If you've used 4444 as a key, you'll have to change not only the customer's key, but also the related records in his orders, his addresses, etc.
(Some will argue that this can be resolved with cascading updates, but it's risky in the presence of triggers.)
Best practice: Create a surrogate key and call it ID (some prefer CustomerID). A surrogate is a key which is hidden from the users (thus its name) and whose only purpose is to provide a unique key. This allows you to make unambiguous joins and deletes without worrying about what users might change.
Every table should have exactly one surrogate primary key and it is either an auto-incremented integer or a GUID (varies according to the database provider).
There is only one allowable exception to this rule: when creating a N-N relation (e.g. one customer can have many addresses and addresses can be shared by customers). In this case it is acceptable to use the [CustomerID, AddressID] pair as the primary key.
Oh, and finally, joins on integers/guids, which are fixed-length are much faster than joins on varying length strings.
一些想法...
如果您要与其他应用程序交互,那么国家和货币的 ISO 标准会更容易被接受。
尽管此类数据或多或少是静态的...国家名称确实会发生变化,例如锡兰更改为斯里兰卡,罗德西亚更改为津巴布韦,因此您需要在许多地方进行更新,而不仅仅是查找时的描述桌子。
输入/显示并不总是使用友好的名称进行,例如用户对此感到满意的货币数据输入。
如果从列表中选择一个国家/地区,那么将其转换为引擎盖下的代码是相当简单的。
A few thoughts...
If you're interfacing with other applications then the ISO standards for countries and currencies would be better received.
Although such data is more or less static ... country names do change, e.g. Ceylon to Sri Lanka, Rhodesia to Zimbabwe, so you would need to update in many places rather than just the description on your lookup table.
Input/display are not always undertaken using the friendly name, e.g. data entry of currency where the users are comfortable with this.
If picking a country from a list then its fairly trivial to have this translated to a code under the bonnet.
如果您要在其他表中使用这些字段的值,那么您将违反数据库设计的规范化原则,特别是,您将在多个位置拥有相同的数据。这很糟糕,原因如下:
if you're going to be using the value of these fields in other tables, then you would be violating normalization principles of database design, specifically, you would have same data in multiple places. This is bad for several reasons: