正确命名表的 id 字段

发布于 2024-10-04 08:31:22 字数 412 浏览 5 评论 0原文

我目前正在阅读 Joe Celko 写的《SQL 编程风格》一书。

在第一章的“开发标准化后缀”段落中,他对 id 列进行了说明:

“_id”= 标识符。它的独特之处在于 模式并引用一个实体 它出现在架构中的任何位置。 切勿使用“>table_name<_id”

几页后他说

不要使用下划线作为第一个 或名字的最后一个字母。看起来 就像这个名字少了另一个 组件。

他不推荐使用“id”作为列名。

所以我想知道你们如何命名 id 列?

我知道大多数人可能会认为这个问题有什么意义,但我正在考虑标准化我的数据模型,尽可能遵循行业标准和 ISO 标准。

I am currently reading the book "SQL Programming Style" wrote by Joe Celko.

In the first chapter, at the paragraph "Develop Standardized Postfixes" he states for the id column :

"_id" = identifier. It is unique in
the schema and refers to one entity
anywhere it appears in the schema.
Never user ">table_name<_id"

Few pages later he states

Do not use an underscore as the first
or last letter in a name. It looks
like the name is missing another
component.

He deprecated "id" as column name.

So I would like to know how you guys name the id column ?

I know that most people might think what the point of this question, but I am looking on standardizing my data model, following industry standards and ISO standards as much as I can.

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(8

好听的两个字的网名 2024-10-11 08:31:22

我也不赞成使用“Id”作为列名,尽管它已经变得非常普遍。 “EmployeeId”比“Id”长,但更具描述性。它还允许外键通常与其引用的列具有相同的名称。当数据库的控制权从一个人转移到另一个人时,这非常有帮助。

上述情况有一个例外。同一个表中可能有两个外键,它们都引用同一个键。还可以有一个自反外键来引用它出现的同一表的不同行中的键。

让我举一个反身键的例子。您有一个员工表,其中包含 EmployeeId 键。您还有另一列,称为 SupervisorId,用于记录主管和多个下属之间的关系。在这种情况下,外键的名称命名角色,而不是实体。

作为替代方案,可以使用用户定义的域来记录两列引用同一事物的事实。同样,当必须将数据的基本含义传达给新人时,这是最有用的。

使用下划线作为符号内的内部视觉分隔符是一个完全可分离的问题。驼峰式命名法比下划线更普遍,甚至有些系统不允许下划线作为符号组成部分。

最重要的是,保持一致。如果你使用武断的、反复无常的、矛盾的命名约定,你最终甚至会迷惑你自己。

I also deprecate the use of "Id" as a column name, even though it has become very widespread. "EmployeeId" is longer than "Id", but it is more descriptive. It also allows a foreign key to generally have the same name as the column to which it refers. This is enormously helpful when control over the database passes from one person to the next.

There is an exception to the above. It's possible to have two foreign keys in the same table that both refer to the same key. It's also possible to have a reflexive foreign key that refers to the key in a different row of the same table where it appears.

Let me give an example of a reflexive key. You have a table of employees, with key EmployeeId. You have another column, called SupervisorId, that records the relationship between a supervisor and several subordintes. The name of the foreign key in this case names the role, and not the entity.

As an alternative, it's possible to use user defined domains to document the fact that two columns refer to the same thing. Again, this is most useful when the fundamental meaning of the data has to be communicated to someone new.

The use of underscore as an internal visual separator inside a symbol is a completely separable issue. Camelcasing has become more widespread than underscore, and there are even systems where underscore is not allowed as a symbol constituent.

Above all, keep it consistent. If you use arbitrary, capricious, and contradictory naming conventions, you'll eventually confuse even your self.

两仪 2024-10-11 08:31:22

我认为这是一个好问题。做你认为好的事情,并且每次都始终这样做。那你就没事了。

我使用表名+“id”模型:UserId、PersonId 等。

I think it's a good question. Do what looks good to you, and always do that, every time. Then you'll be fine.

I use the tablename + 'id' model: UserId, PersonId etc.

逐鹿 2024-10-11 08:31:22

我不会分享我对命名标准的看法,而是尝试回答你的问题;)

我认为 Celko 的观点是学生表中的 Student_ID 是一种代码味道,即设计者的风格可能是总是添加他们在物理模型中创建的每个表(即使逻辑模型中没有此类列)都有一个 ID 列(可能是自动增量列),目的是使用这些 ID 列作为外键。换句话说,Celko 不希望您始终使用代理键,而是希望您在适当的情况下使用自然键。

如果您继续阅读第 1.2.5 节(第 14-15 页)并遵循他的表名称规则,您将发现为什么表名称 + _ID 不太可能出现:

如果我找不到行业标准
(姓名),我想找一个集体
或类名...例外:使用
如果表实际上是单数名称
其中只有一行。

因此,例如,如果您有一个包含学生数据的表,它可能被称为“学生”而不是“学生”,但更有可能是“注册”(或类似的)。并且只包含一行的表不太可能需要 _ID 列。

我想有些名词的复数与单数相同,所以也许 Sheep_ID 是可以接受的(当然,只有在没有行业标准绵羊标识符的情况下!)

还要考虑规则 1.3.2。 (p19) 避免不同位置的名称发生变化,例如,在 Students 表中称为 ID 的域,在其他表中称为 Student_ID 的域。整个模式中不太可能只有一个名为 _ID 的元素!

Rather than share my opinions on naming standards, I'll attempt to answer your question ;)

I think the point Celko is making is that student_ID in a table of students is a code smell i.e. it could be that the designer's style is to always adds an ID column, probably an auto-increment column, to every table they create in the physical model (even when there is no such column in the logical model) with the intention of using these ID columns for foreign keys. In other words, Celko does not want you to always use surrogate key, rather he wants you to use natural keys where appropriate.

If you read on to section 1.2.5 (p14-15) and follow his rules for table names, you'll discover why table name + _ID an unlikely occurrence:

if I cannot find an industry standard
(name), I would look for a collective
or class name... Exception: use a
singular name if the table actually
has one and only one row in it.

So, for example, if you had a table containing student data it may be called Students rather than Student but more likely to be Enrolment (or similar). And a table containing one and only one row is unlikely to need an _ID column.

I suppose there are nouns for whom the plural is the same as the singular so maybe Sheep_ID is acceptable (but only in absence of an industry standard ovine identifier, of course!)

Also consider the rule 1.3.2. (p19) Avoid Names That Change From Place to Place e.g. the same domain referred to in the Students table as ID and in other tables as student_ID. It is unlikely that there will only be one element named _ID in the entire schema!

如梦亦如幻 2024-10-11 08:31:22

对于表 ID,我总是使用表名 + ID。

这样做的原因是为了避免查询中的列名不明确,当它是 1 到 1 映射时

有时我会快速编写 sql 来进行测试,如下所示

Select
  * 
FROM table1
Inner join table2 on table1ID = table2ID

如果我没有在 ID 列中使用表名,那么这会引发错误(迫使我使用表上的别名)

Select
  * 
FROM table1
Inner join table2 on ID = ID

也是使用表名的另一个很好的理由,通常测试查询以查看存在哪些数据,使用“*”来选择列。如果您执行连接和 Select *,有时很难理解哪个 ID 来自哪个表,特别是当您从 2 个以上的表返回大量列时

For Table IDs I always use tablename + ID.

The reason for this is to avoid ambiguous column names in Queries when it is a 1 to 1 mapping

Sometimes I quickly write up sql to test like this

Select
  * 
FROM table1
Inner join table2 on table1ID = table2ID

If I didnt use tablename in the ID column then this would throw an error (forcing me to use aliases on the tables)

Select
  * 
FROM table1
Inner join table2 on ID = ID

Also another good reason to use the table name, in general testing queries to see what data exists use the "*" to select columns. If you do a join and Select *, sometimes it is difficult to understand what ID came from what table, especially if you are returning a large number of columns from more than 2 tables

国际总奸 2024-10-11 08:31:22

我始终主张全球唯一的 TABLENAME_ID。在这一点上,我强烈鼓励表名充分描述其上下文,因此在进行外部引用时,它们的应用绝不会出现任何歧义。

I always advocate for globally unique TABLENAME_ID. On that note, I strongly encourage table names which fully describe their context, so there is never any ambiguity as to their application when foreign references are made.

坚持沉默 2024-10-11 08:31:22

ID 作为列名很难维护,而且在我看来更容易导致连接错误。

例如,假设您始终在每个表中使用 ID 作为列名。

现在假设您需要加入其中六个表。作为一个典型的人,您复制第一个连接并更改表名称。如果你错过了一个,并且使用了 id,你将得到一个运行的查询并给出错误的答案。如果您使用 tablenameId 您将收到语法错误。请参阅以下示例代码:

create table #test1 (id int identity, test varchar(10))
create table #test2 (id int identity, test varchar(10))
create table #test3 (id int identity, test varchar(10))

insert #test1
values ('hi')
insert #test1
values ('hello')
insert #test2
values ('hi there')
insert #test3
values ('hello')
insert #test3
values ('hi')
select * 
from #test1 t1
join #test2 t2
    on t1.id = t2.id
join #test3  t3
    on t1.id = t2.id    
select * 
from #test1 t1
join #test2 t2
    on t1.id = t2.id
join #test3  t3
    on t1.id = t3.id        

Drop table #test1
drop table #test2
drop table #test3   
Go

create table #test1 (t1id int identity, test varchar(10))
create table #test2 (t2id int identity, test varchar(10))
create table #test3 (t3id int identity, test varchar(10))   


    insert #test1
    values ('hi')
    insert #test1
    values ('hello')
    insert #test2
    values ('hi there')
    insert #test3
    values ('hello')
    insert #test3
    values ('hi')

select * 
from #test1 t1
join #test2 t2
    on t1.t1id = t2.t2id
join #test3 t3
    on t1.t1id = t3.t3id    

select * 
from #test1 t1
join #test2 t2
    on t1.t1id = t2.t2id
join #test3 t3
    on t1.t1id = t2.t3id    

    Drop table #test1
    drop table #test2
    drop table #test3   

使用 tablenameId 的另一件事是,当您需要复杂报告查询中多个表的实际 id 时,您不必创建别名来查看哪个 id 来自何处(以及让报告应用程序满意,因为他们中的大多数都坚持为报告使用唯一的字段名称)。

ID as a column name is hard to maintain and in my opinion can more easily lead to mistakes in joins.

Suppose for instance you always used ID as a column name in every table.

Now suppose you need to join to six of those tables. And being a typical person, you copy the first joins and change the table names. If you miss one, and you use id you will get a query that runs and gives the wrong anaswer. If you use tablenameId you will get a syntax error. see the following code for an example:

create table #test1 (id int identity, test varchar(10))
create table #test2 (id int identity, test varchar(10))
create table #test3 (id int identity, test varchar(10))

insert #test1
values ('hi')
insert #test1
values ('hello')
insert #test2
values ('hi there')
insert #test3
values ('hello')
insert #test3
values ('hi')
select * 
from #test1 t1
join #test2 t2
    on t1.id = t2.id
join #test3  t3
    on t1.id = t2.id    
select * 
from #test1 t1
join #test2 t2
    on t1.id = t2.id
join #test3  t3
    on t1.id = t3.id        

Drop table #test1
drop table #test2
drop table #test3   
Go

create table #test1 (t1id int identity, test varchar(10))
create table #test2 (t2id int identity, test varchar(10))
create table #test3 (t3id int identity, test varchar(10))   


    insert #test1
    values ('hi')
    insert #test1
    values ('hello')
    insert #test2
    values ('hi there')
    insert #test3
    values ('hello')
    insert #test3
    values ('hi')

select * 
from #test1 t1
join #test2 t2
    on t1.t1id = t2.t2id
join #test3 t3
    on t1.t1id = t3.t3id    

select * 
from #test1 t1
join #test2 t2
    on t1.t1id = t2.t2id
join #test3 t3
    on t1.t1id = t2.t3id    

    Drop table #test1
    drop table #test2
    drop table #test3   

Another thing about using tablenameId is that when you want the actual id from several tables in a complex reporting query, you don't have to create aliases in order to see which id came from where (and to make the reporting application happy as most of them inist on unique fieldnames for a report).

隔岸观火 2024-10-11 08:31:22

哇,我本来要写“我总是使用 TablenameID,但世界上其他人都不同意我的观点”。不过,看来这里的每个人都同意我的观点。

当然,就是当我在表中使用代理整数 ID 时。如果有一个自然主键,我会使用它。

Wow, I was going to write "I always use TablenameID but everyone else in the world disagrees with me". However, it looks like everyone here agrees with me.

That is, of course, when I use a surrogate integer ID in the table. If there's a natural primary key I use that instead.

偏爱自由 2024-10-11 08:31:22

在我的数据库中:

对于外键 ID,我使用单数版本的外部表名称 +“Id”。我使用大写的 I,小写的 d,因为这是外汇警察在我心中根深蒂固的标准。

对于自动递增身份,我经常使用“SequenceId”

在我的数据层中:

我使用对象名称+“Id”,遵循“Id”的最佳实践标准

In my database:

For a foreign key ID, I use the singular version of the foreign table name + "Id". I use the capital I, lower d as it is a standard ingrained in me by FX cop.

For auto incrementing identities I often use "SequenceId"

In my data layer:

I use the name of the object + "Id", following best practice standards for "Id"

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文