Foreign key: is a string or an integer preferred?
I have a user table with userid and username columns, and both are unique.
Between userid and username, which would be better to use as a foreign key, and why?
My boss wants to use the string. Is that OK?
Comments (4)
An int is 4 bytes, while a string can be as many bytes as you like. Because of that, an int key will generally perform better, unless of course you stick with usernames that are fewer than 4 characters long :)
Besides, you should never use a column as a PK/FK if the data in the column itself can change. Users tend to change their usernames, and even if that functionality doesn't exist in your app right now, maybe it will in a few years. When that day comes, you might have 1000 tables that reference the user table, and then you'll have to update all 1000 tables within a single transaction, and that's just bad.
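To make the size argument concrete, here is a minimal sketch that compares the packed size of a 32-bit integer with the UTF-8 byte length of a few usernames. The usernames are made up for illustration, and the actual on-disk size of a key depends on the database engine, column type, and collation, so treat the numbers as a rough guide only.

```python
import struct

# A 32-bit integer key packs into a fixed number of bytes.
int_key_size = struct.calcsize("<i")

# Hypothetical usernames; the byte length varies with the text itself.
usernames = ["bob", "alice_wonderland"]
sizes = {name: len(name.encode("utf-8")) for name in usernames}

print("int key bytes:", int_key_size)
for name, size in sizes.items():
    print(name, "bytes:", size)
```

Every row in every referencing table carries a copy of the key, so the difference is multiplied by the number of child rows and index entries.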
An int will index faster. That may or may not be an issue; it's hard to say based on what you have provided.
It depends on the foreign key. If your company has control over it, then I recommend using an int if there is an ID field for it. However, sometimes there is no ID field on a table because another key makes sense as an alternate unique key. So, the ID field might be a surrogate key in that case.
Rule of thumb: your foreign key data type should match your primary key data type.
Here's an exception: what about foreign keys that don't belong to your company? What about foreign keys to databases and APIs that you have no control over? Those IDs should always be strings, IMO.
To convince you, I ask these questions:
Are you doing math on it? Are you incrementing it? Do you have control over it? APIs are notorious for change, and even data types CAN be changed in someone else's database... so how much will it mess you up when an int ID becomes a hex string?
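The point about external IDs can be sketched as follows. The ExternalAccount class, the parse_as_int helper, and both ID values are hypothetical and exist only to illustrate what happens when a provider's IDs stop being numeric.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExternalAccount:
    # ID issued by a third-party system; stored as opaque text.
    external_id: str


def parse_as_int(raw: str) -> int:
    """What code that assumed numeric IDs would have to do."""
    return int(raw)


old_style = ExternalAccount(external_id="10482")
new_style = ExternalAccount(external_id="a3f9c2")  # provider switched format

# Storing the ID as a string accepts both formats unchanged.
accounts = {acct.external_id: acct for acct in (old_style, new_style)}

# Code that insisted on an integer rejects the new format.
try:
    parse_as_int(new_style.external_id)
    int_parse_failed = False
except ValueError:
    int_parse_failed = True

print(len(accounts), int_parse_failed)
```

Since no arithmetic is ever done on such an ID, nothing is lost by keeping it as text, and a format change on the provider's side does not force a schema migration.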
It depends
There are many existing discussions on the trade-offs between natural and surrogate keys. You will need to decide what works for you, and what the 'standard' is within your organisation.
In the OP's case, there is both a surrogate key (int userId) and a natural key (char or varchar username). Either column can be used as the primary key for the table, and either way, you will still be able to enforce uniqueness of the other key. Here are some considerations when choosing one way or the other:
The case for using Surrogate Keys (e.g. UserId INT AUTO_INCREMENT)
If you use a surrogate (e.g. UserId INT AUTO_INCREMENT) as the primary key, then all tables referencing table MyUsers should use UserId as the foreign key. You can, however, still enforce uniqueness of the username column through an additional unique index.
As per @Dagon, using a narrow primary key (like an int) has performance and storage benefits over using a wider (and variable-length) value like varchar. This benefit also carries over to the tables which reference MyUsers, as the foreign key to userid will be narrower (fewer bytes to fetch).
Another benefit of the surrogate integer key is that the username can be changed easily without affecting tables referencing MyUsers.
If username were used as a natural key, and other tables were coupled to MyUsers via username, changing a username would be very inconvenient (since the foreign key relationship would otherwise be violated). If usernames must be updated in tables that use username as the foreign key, a technique like ON UPDATE CASCADE is needed to retain data integrity.
The case for using Natural Keys (i.e. username)
One downside of using surrogate keys is that other tables which reference MyUsers via a surrogate key will need to be JOINed back to the MyUsers table if the Username column is required. One of the potential benefits of natural keys is that if a query requires only the Username column from a table referencing MyUsers, it need not join back to MyUsers to retrieve the user name, which saves some I/O overhead.
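The two designs can be compared side by side in a small sketch. It uses Python's sqlite3 module with an in-memory database so that it is runnable as-is; this is an assumption for illustration, since the answer's AUTO_INCREMENT syntax suggests MySQL, where the DDL would differ slightly. The Orders, MyUsersNat, and OrdersNat tables are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled

conn.executescript("""
-- Surrogate-key design: narrow integer PK, username kept unique separately.
CREATE TABLE MyUsers (
    UserId   INTEGER PRIMARY KEY AUTOINCREMENT,
    Username TEXT NOT NULL UNIQUE
);
CREATE TABLE Orders (
    OrderId INTEGER PRIMARY KEY,
    UserId  INTEGER NOT NULL REFERENCES MyUsers(UserId)
);

-- Natural-key design: username is the PK and is copied into child rows.
CREATE TABLE MyUsersNat (
    Username TEXT PRIMARY KEY
);
CREATE TABLE OrdersNat (
    OrderId  INTEGER PRIMARY KEY,
    Username TEXT NOT NULL
        REFERENCES MyUsersNat(Username) ON UPDATE CASCADE
);
""")

user_id = conn.execute(
    "INSERT INTO MyUsers (Username) VALUES ('alice')"
).lastrowid
conn.execute("INSERT INTO Orders (OrderId, UserId) VALUES (1, ?)", (user_id,))
conn.execute("INSERT INTO MyUsersNat (Username) VALUES ('alice')")
conn.execute("INSERT INTO OrdersNat (OrderId, Username) VALUES (1, 'alice')")

# Surrogate key: the rename touches one row; Orders is unaffected.
conn.execute("UPDATE MyUsers SET Username = 'alice2' WHERE UserId = ?", (user_id,))
# Reading the name for an order requires a join back to MyUsers.
surrogate_name = conn.execute(
    "SELECT u.Username FROM Orders o "
    "JOIN MyUsers u ON u.UserId = o.UserId WHERE o.OrderId = 1"
).fetchone()[0]

# Natural key: the rename cascades into every referencing row.
conn.execute("UPDATE MyUsersNat SET Username = 'alice2' WHERE Username = 'alice'")
# The name is available on the child row without a join.
natural_name = conn.execute(
    "SELECT Username FROM OrdersNat WHERE OrderId = 1"
).fetchone()[0]

print(surrogate_name, natural_name)
```

In the surrogate design the rename is a single-row update and the UNIQUE constraint still prevents duplicate usernames. In the natural-key design the rename succeeds only because of ON UPDATE CASCADE, and the database has to rewrite every child row that holds the old name.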