当存在其他唯一字段时,为什么要使用自动递增主键?

发布于 2024-09-30 16:49:45 字数 624 浏览 8 评论 0原文

我正在学习一门名为“数据库系统”的课程,对于我们的课堂项目,我必须设计一个网站。

这是我创建的表的示例:

CREATE TABLE users
(
  uid INT NOT NULL AUTO_INCREMENT,
  username VARCHAR(60),
  passhash VARCHAR(255),
  email VARCHAR(60),
  rdate DATE,
  PRIMARY KEY(uid)
);

教授告诉我“uid”(用户 ID)完全无用且不必要,我应该使用用户名作为主键,因为没有两个用户可以拥有相同的用户名。

我告诉他使用用户 ID 对我来说很方便,因为当我调用 example.com/viewuser?id=5 之类的东西时,我只需使用以下命令检查参数:is_numeric($_GET[' id'])...不用说他不相信。

由于我在大量教程中看到了 user_id 和其他类似的属性(thread_id、comment_id 等)并查看了流行软件(例如 vbulletin)的数据库模式,因此肯定还有很多其他(更强有力的)原因。

所以我的问题是:与使用用户名等其他属性相比,您如何证明需要非空自动递增 id 作为主键?

I'm taking a course called "database systems" and for our class project I have to design a website.

Here's an example of a table I created:

CREATE TABLE users
(
  uid INT NOT NULL AUTO_INCREMENT,
  username VARCHAR(60),
  passhash VARCHAR(255),
  email VARCHAR(60),
  rdate DATE,
  PRIMARY KEY(uid)
);

The professor told me "uid" (user id) was completely useless and unnecessary and I should have used the username as the primary key, since no two users can have the same username.

I told him it was convenient for me use a user id because when I call something like example.com/viewuser?id=5 I just check the parameter with: is_numeric($_GET['id'])... needless to say he was not convinced.

Since I've seen user_id and other similar attributes (thread_id, comment_id, among others) on plenty of tutorials and looking at the database schema of popular software (eg. vbulletin) there must be plenty of other (stronger) reasons.

So my question is: How would you justify the need of a not null auto incrementing id as a primary key vs using another attribute like the username?

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(12

凉城已无爱 2024-10-07 16:49:46

这通常称为代理键,它有很多好处。其中之一是将数据库关系与应用程序数据隔离。更多详细信息和相应的缺点可以在上面提供的 wiki 链接中找到。

This is typically called a surrogate key and it has many benefits. One of which is insulating your database relationships from the application data. More details and the corresponding disadvantages can be found at the wiki link provided above.

荒路情人 2024-10-07 16:49:46

因为有人可能想更改他们的用户名(或任何与此相关的名称)。

Because someone might want to change their username (or any name for that matter).

绮筵 2024-10-07 16:49:46

您的教授正在做正确的事情,他指出,如果要求用户名应该唯一,您应该使用户名唯一且不可为空。 uid 也可以是一个密钥,但除非您实际上在某个地方使用它,否则不需要它。设计中更重要的方面应该是实现自然键。所以我同意你教授的评论。

Your professor is doing the right thing by pointing out that you should have made username unique and not nullable if it was a requirement that user names should be unique. The uid could be a key as well but unless you are actually using it somewhere then it isn't needed. The more important aspect of the design ought to be to implement the natural key. So I agree with your professor's comment.

夏花。依旧 2024-10-07 16:49:46

我需要具有更多数据库知识的人来支持我,但我相信您在外键查找时间上会得到更快的响应。

此外,您稍后可能会决定希望更改用户名,或者用户名的要求可能会更改(可能是更长的字符串?)。使用 ID 可以避免更改所有外键。

让我们面对现实吧,大多数项目都不会扩展那么多,但是当您现在可以遵守良好的编程标准时,您真的愿意冒 12 个月后头痛的风险吗?

I'll need someone with more database knowledge to back me up on this one, but i believe you get a faster response in foreign key lookup time.

Additionally, you may decide later that you want usernames to change, or that the requirements for usernames may change (maybe a longer string?). Using an ID prevents having to change all foreign keys.

Lets face it, most projects aren't going to expand that much, but do you really want to risk the headache 12 months down the road, when you could conform to good programming standards now?

留一抹残留的笑 2024-10-07 16:49:46

例如,整数搜索 (?id=5) 比字符串搜索 (?username=bob) 要快得多,并且基数更高。
另一个例子,uid 是 auto_increment,因此您不必显式插入它,但它会在每次插入查询中自动递增。

PS:你的教授对此大错特错了:D

For instance, integer search (?id=5) is much way faster and has higher cardinality than string search (?username=bob).
Another example, uid is auto_increment, so you don't have to insert it explicitly but it will auto increment in each insert query.

PS: Your prof is soooo wrong about it :D

裸钻 2024-10-07 16:49:46

我们使用ID来防止重复数据,它可以使一些过程变得不复杂(如果我们要更新或删除数据),如果我们使用ID则更简单。

如果您不想使用 ID,您可以使用其他字段。但不要忘记让它们变得独一无二。它可以使您的数据免受重复数据的影响。

PRIMARY 之外的另一种方式是 UNIQUE。

we use ID to prevent duplication data and it can make some procces become not complicated (if we want to update or delete data), it more simple if we use ID.

if you dont want to use ID you can use another fields. but dont forget to make them become UNIQUE. it can make your data become preventive from duplication data.

another way outside PRIMARY is UNIQUE.

小ぇ时光︴ 2024-10-07 16:49:46

我同意上面所有的答案。我想说 ID 很容易实现,并且在索引方面,与 varchar 相比,Int 始终是首选。你的教授应该更清楚,他为什么要拒​​绝 Int id 高于我!

I go with all the answers above. I would say an ID is easy to implement and when it comes to indexing, Int is always preferred compared to a varchar. Your professor should know better, why would he say no to Int id is above me!

相对绾红妆 2024-10-07 16:49:46

因为userid应该是唯一的(不能重复)&有时是索引。

Because userid is supposed to be unique (cannot be duplicated) & sometimes is index.

懒的傷心 2024-10-07 16:49:46

您是否想以明文形式存储您的用户名以供任何人窃取?我永远不会考虑使用有一天我可能想要加密(或者现在想要加密)的自然密钥。

And do you want to store your usernames in clear text for any one to steal? I would never consider using a natural key that I might want to encrypt someday (or want to encrypt now).

只是一片海 2024-10-07 16:49:45

自动递增主键很有用,有几个原因:

  • 它们允许重复的用户名,就像 Stack Overflow 上一样
  • 它们允许更改用户名(或电子邮件地址,如果用于登录)(很容易)
  • 选择、联接和插入比varchar 主键,因为它维护数字索引的速度要快得多
  • 正如您所提到的,验证变得非常简单: if ((int)$id > 0) { ... }
  • 输入的卫生很简单: $id = (int)$_GET['id']
  • 由于外键不必复制潜在的大字符串值,因此开销要少得多,

我会说尝试使用任何字符串信息作为当自动递增数字键如此容易获得时,记录的唯一标识符是一个坏主意。

具有唯一用户名的系统对于极少数用户来说是很好的,但互联网使它们从根本上崩溃了。当您考虑到可能需要与网站交互的名为“john”的人数之多时,要求他们每个人都使用唯一的显示名称是荒谬的。它导致了我们经常看到的糟糕的系统,用随机的数字和字母装饰用户名。

然而,即使在强制使用唯一用户名的系统中,它仍然不是主键的糟糕选择。想象一个拥有 500 个帖子的用户:posts 表中的外键将包含重复 500 次的用户名。即使在您考虑到有人最终可能需要更改其用户名之前,开销也是令人望而却步的。

Auto-incrementing primary keys are useful for several reasons:

  • They allow duplicate user names as on Stack Overflow
  • They allow the user name (or email address, if that's used to login) to be changed (easily)
  • Selects, joins and inserts are faster than varchar primary keys as its much faster to maintain a numeric index
  • As you mentioned, validation becomes very simple: if ((int)$id > 0) { ... }
  • Sanitation of input is trivial: $id = (int)$_GET['id']
  • There is far less overhead as foreign keys don't have to duplicate potentially large string values

I would say trying to use any piece of string information as a unique identifier for a record is a bad idea when an auto-incrementing numeric key is so readily available.

Systems with unique user names are fine for very small numbers of users, but the Internet has rendered them fundamentally broken. When you consider the sheer number of people named "john" that might have to interact with a website, it's ridiculous to require each of them to use a unique display name. It leads to the awful system we see so frequently with random digits and letters decorating a username.

However, even in a system where you enforced unique usernames, it's still a poor choice for a primary key. Imagine a user with 500 posts: The foreign key in the posts table is going to contain the username, duplicated 500 times. The overhead is prohibitive even before you consider that somebody might eventually need to change their username.

动听の歌 2024-10-07 16:49:45

如果用户名是主键并且用户更改了他/她的用户名,则您将需要更新所有具有对用户表的外键引用的表。

If the username is the primary key and a user changes his/her username, you will need to update all the tables which have foreign key references to the users table.

八巷 2024-10-07 16:49:45

如果您已经向您的教授证明,为每个用户分配唯一的任意整数对您的应用程序有价值,那么他说这是“完全无用且不必要”的说法当然是错误的。

然而,也许你忽略了他的观点。如果他告诉您要求是“没有两个用户可以拥有相同的用户名”,那么您还没有满足该要求。

衷心感谢您发布 SQL DDL,它非常有用,但大多数人对此不屑一顾。

使用你的表,我可以这样做:

INSERT INTO users (username) VALUES (NULL);
INSERT INTO users (username) VALUES (NULL);
INSERT INTO users (username) VALUES (NULL);
INSERT INTO users (username) VALUES (NULL);
INSERT INTO users (username) VALUES (NULL);

结果如下:

SELECT uid, username, passhash, email, rdate 
FROM users;

uid   username   passhash   email   rdate
1     <NULL>     <NULL>     <NULL>  <NULL>
2     <NULL>     <NULL>     <NULL>  <NULL>
3     <NULL>     <NULL>     <NULL>  <NULL>
4     <NULL>     <NULL>     <NULL>  <NULL>

我认为这就是你的教授试图表达的观点:如果不对用户名强制使用自然键,你实际上没有任何数据完整性全部。

如果我是教授,我也会敦促您从设计中删除可为空的列。

If you have demonstrated to your professor that assigning a unique arbitrary integer to each user is of value to your application then of course he would be wrong to say that it is "completely useless and unnecessary".

However, maybe you missed his point. If he told you that the requirement is that "no two users can have the same username" then you haven't met that requirement.

Sincere thanks for posting your SQL DDL, it is very useful but most don't bother on SO.

Using your table, I can do this:

INSERT INTO users (username) VALUES (NULL);
INSERT INTO users (username) VALUES (NULL);
INSERT INTO users (username) VALUES (NULL);
INSERT INTO users (username) VALUES (NULL);
INSERT INTO users (username) VALUES (NULL);

Which results in this:

SELECT uid, username, passhash, email, rdate 
FROM users;

uid   username   passhash   email   rdate
1     <NULL>     <NULL>     <NULL>  <NULL>
2     <NULL>     <NULL>     <NULL>  <NULL>
3     <NULL>     <NULL>     <NULL>  <NULL>
4     <NULL>     <NULL>     <NULL>  <NULL>

I think is the point your professor was trying to make: without enforcing the natural key on username you don't really have any data integrity at all.

If I was the prof, I'd also urge you to remove nullable columns from your design.

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文