1:1 或 1:0 关系的标准化

发布于 2024-10-20 08:26:50 字数 142 浏览 9 评论 0原文

当使用关系数据库并且您想要 3NF(您用英语称之为 3NF 吗?)时,您可以将 1:1 关系拉到一张表中。但如果配给比例是 1:0/1(/ 意思是“或”),会发生什么呢?

然后将它们分开以避免表格中出现空格?在这种情况下,将它们分开是有效的 3NF 吗?

when using relation databases and you want 3NF (do you call it 3NF in english?), then you pull 1:1 relationsships together into one table. But what happens if the rationship is 1:0/1 (/ meaning or)?

Then you keep them separated to avoid blank spaces in tables? Kepping them apart is valid 3NF in ths case?

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(3

雪落纷纷 2024-10-27 08:26:50

根据您的问题和随后对 @paxdialbo 的回答的评论,我的理解是您想要一个存储可选属性的解决方案,其中有很多属性,同时避免 NULL。实现此目的有两种方法,第六范式 (6NF) 或 实体属性值 (EAV) 模型。

第六范式

这涉及创建特定于属性的表:

create table attributeName (
    id
    value
)

其中id 是外键,value 捕获该属性(例如社会安全号码)。缺少给定键的记录表示不存在。

实体属性值

现在正如您可以想象的那样,第六范式可以导致表扩散。 EAV 模型通过对多个属性使用类似的模型来解决问题,如下所示:

create table integerAttribute (
    name
    id
    value
)

name 列标识属性(例如“SocialSecurity”),尽管更复杂的实现是代替 name > 列,name 存储在单独的元数据表中,并通过外键引用。无论如何,这种方法意味着有不同数据类型的其他表(即datetimeAttributevarcharAttribute等)。

结论

真正需要思考的问题是您正在处理多少可选属性。如果相对较少,最简单的解决方案实际上是在主表上添加可选的 NULLable 列。 6NF 和 EAV 显着增加了复杂性和性能问题。通常,使用这些方法之一时所做的是将整个实体序列化为主表上的 CLOB,以简化常见读取(即通过主键),以避免多个 LEFT 连接来检索完全水合的实体。

Based on your question and subsequent comments on @paxdialbo's answer my understanding is you want a solution for storing optional attributes, of which there are many, while avoiding NULLs. Two ways of accomplishing this, 6th Normal Form (6NF) or an Entity Attribute Value (EAV) model.

6th Normal Form

This involves creating a table specific to the attribute:

create table attributeName (
    id
    value
)

Where id is a foreign key and value captures that attribute (e.g. social security number). Absence of a record for a given key indicates non-existence.

Entity Attribute Value

Now as you can imagine, 6th Normal Form can lead to table proliferation. An EAV model solves by using similar model for multiple attributes as such:

create table integerAttribute (
    name
    id
    value
)

The name column identifies the attribute (e.g. 'SocialSecurity'), albeit a more sophisticated implementation is instead of a name column, name is stored in a separate meta data table and referenced via a foreign key. Regardless, this approach implies have other tables for different data types (i.e. datetimeAttribute, varcharAttribute, etc...).

Conclusion

The real question to ponder is how many optional attributes you're dealing with. If relatively few, the easiest solution is actually adding optional NULLable columns on the main table. 6NF and EAV add significant complexity and performance concerns. Often what's done when using one of these approaches is serializing the overall entity into a CLOB on the main table to simplify the common read (i.e. by primary key), to avoid multiple LEFT joins to retrieve a fully hydrated entity.

如何视而不见 2024-10-27 08:26:50

第三范式基本上意味着属性(或列)依赖于键、整个键并且除了键之外什么都没有(所以帮助我,Codd)。

如果您有一个属性,该属性要么存在,要么不存在,该属性本身仍然可能遵循规则。

在这些情况下,我只需将属性保留在主表中,并使它们可以为空,以指示它们是否适合该行。

通过(人为的)示例,您可能有一个 SocialSecurityNumber 属性作为您的主键(我不会讨论这是否是一个好主意或者您是否应该使用代理项)关键,因为它与问题无关)。

进一步假设您有一个独特的 BankAccount 属性用于支付工资,并且您不是那些可以将工资分配到多个银行账户以避税的好雇主之一:- )

现在某人的银行帐户完全取决于所选密钥,但并非每个人都可能拥有一把(他们可能会以现金支付)。换句话说,正如您所说,这是一个经典的 1:0/1 案例。

在这种情况下,您只需将表中的银行帐号设置为空即可。

Third normal form basically means that an attribute (or column) depends on the key, the whole key and nothing but the key (so help me, Codd).

If you have an attribute which is either there or not there, that attribute itself may still follow the rules.

In those cases, I would simply keep the attributes in the main table and make them nullable to indicate whether or not they're appropriate for the row.

By way of (contrived) example, you may have a SocialSecurityNumber attribute as your primary key (I won't go into the arguments as to whether this is a good idea here or whether you should use a surrogate key since it's irrelevant to the question).

Further assume that you have a distinct BankAccount attribute for paying their wage into, and that you're not one of those nice employers that can distribute a wage to multiple bank accounts for the purpose of dodging taxes :-)

Now the bank account of someone is dependent entirely on the chosen key but not everyone may possess one (they may be paid in cash). In other words, a classic 1:0/1 case as you put it.

In that case, you would simply make the bank account number nullable in the table.

楠木可依 2024-10-27 08:26:50

基于您对 paxdiablo 的评论。 。 。

让我们看一些 SQL。我本可以为这些列选择更好的名称,但我故意没有这样做。我并没有偷懒;我有充分的理由。外部谓词是用户应该如何解释表的内容。

-- External predicate: Human is identified by 
--                         Social Security Account Number [ssan] 
--                     and has full name [full_name] 
--                     and result of last HIV test [hiv_status]
--                     and has checking account [bank_account]
--                     and was born at exactly [birth_date].
--
create table human (
ssan char(9) primary key, 
full_name varchar(35) not null,
hiv_status char(3) not null default 'Unk' 
    CHECK (hiv_status in ('Unk', 'Pos', 'Neg')),
bank_account varchar(20),
birth_date timestamp not null
);

-- External predicate: Human athlete identified by 
--                         Social Security Account Number [ssan] 
--                     has current doping status [doping_status]
create table athlete (
ssan char(9) not null primary key references human (ssan),
doping_status char(3) not null default 'Unk' 
    CHECK (doping_status in ('Unk', 'Pos', 'Neg'))
);

-- External predicate: Human dictator identified by 
--                         Social Security Account Number [ssan] 
--                     has estimated benevolence of [benevolence_score].
create table dictator (
ssan char(9) not null primary key references human (ssan),
benevolence_score integer not null default 3 
    CHECK (benevolence_score between 1 and 5) -- 1 is least, 5 is most benevolent
);

所有这三个表都在 5NF 中。 (这意味着他们也属于 3NF。)

你说

a 中不存在“IS A”关系
关系数据库

运动员“是”人类,因为它的标识符是人类标识符。在本例中,其主键是引用 human (ssan) 的外键。数据库设计者通常不会用“IS A”和“HAS A”关系来谈论,因为谓词更精确且更具表现力。通过比较这两种说法,您可以看到差异。

  • 人类“HAS A”[birth_date]
  • 人类确切出生于
    [出生日期]

最后一个故意有点不和谐。我将birth_date 列定义为时间戳——它包含日期和时间。它说明了外部谓词如何在某种程度上独立于列名。 (它还说明了谓词和列名之间的松散耦合在这里可能不是一个好主意。)

你说

但现在你再也不敢得到一个纯粹的人类
但只是人类的孩子

我不确定你所说的“纯人类”是什么意思。 。

SELECT * FROM human;

如果你的意思是除非这个人是运动员或独裁者(或其他什么),否则你就不能拥有一个人类,那么你就错了 如果特定 SSAN 中没有运动员行,则该 SSAN 识别的人不是运动员。如果独裁者中没有针对特定 SSAN 的行,则该 SSAN 标识的人不是独裁者。

Building on your comments with paxdiablo . . .

Let's look at some SQL. I could have chosen better names for the columns, but I deliberately didn't. I wasn't being lazy; I had good reasons. An external predicate is how users are supposed to interpret the contents of a table.

-- External predicate: Human is identified by 
--                         Social Security Account Number [ssan] 
--                     and has full name [full_name] 
--                     and result of last HIV test [hiv_status]
--                     and has checking account [bank_account]
--                     and was born at exactly [birth_date].
--
create table human (
ssan char(9) primary key, 
full_name varchar(35) not null,
hiv_status char(3) not null default 'Unk' 
    CHECK (hiv_status in ('Unk', 'Pos', 'Neg')),
bank_account varchar(20),
birth_date timestamp not null
);

-- External predicate: Human athlete identified by 
--                         Social Security Account Number [ssan] 
--                     has current doping status [doping_status]
create table athlete (
ssan char(9) not null primary key references human (ssan),
doping_status char(3) not null default 'Unk' 
    CHECK (doping_status in ('Unk', 'Pos', 'Neg'))
);

-- External predicate: Human dictator identified by 
--                         Social Security Account Number [ssan] 
--                     has estimated benevolence of [benevolence_score].
create table dictator (
ssan char(9) not null primary key references human (ssan),
benevolence_score integer not null default 3 
    CHECK (benevolence_score between 1 and 5) -- 1 is least, 5 is most benevolent
);

All three of those tables are in 5NF. (Which means they're also in 3NF.)

You said

there is no "IS A"-relationship in a
relational database

An athlete "IS A" human, because its identifier is a human identifier. In this case, its primary key is a foreign key that references human (ssan). Database designers don't usually talk in terms of "IS A" and "HAS A" relationships, because predicates are more precise and expressive. You can see the difference by comparing these two statements.

  • human "HAS A" [birth_date]
  • human was born at exactly
    [birth_date]

That last one is deliberately a little jarring. I defined the birth_date column as a timestamp--it accommodates both date and time. It illustrates how external predicates are to some extent independent of the column names. (It also illustrates how the loose coupling between predicates and column names might not be such a good idea here.)

You said

But now you nerver get a pure HUMAN
but only children of HUMAN

I'm not sure what you mean by "pure human". You can get all the humans by simply

SELECT * FROM human;

If you mean that you can't have a human unless the human is an athlete or dictator (or whatever), then you're mistaken. If there's no row in athlete for a specific SSAN, then the human identified by that SSAN isn't an athlete. If there's no row in dictator for a specific SSAN, then the human identified by that SSAN isn't a dictator.

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文