Boyce-Codd 范式的良好 KISS 描述是什么?

发布于 2024-07-13 16:49:51 字数 177 浏览 8 评论 0原文

什么是 KISS(保持简单、愚蠢)方法来记住 Boyce-Codd 范式是什么以及如何采用非标准化表和 BCNF?

维基百科的信息:对我来说不是很有帮助。

What is a KISS (Keep it Simple, Stupid) way to remember what Boyce-Codd normal form is and how to take a unnormalized table and BCNF it?

Wikipedia's info: not terribly helpful for me.

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(6

戏舞 2024-07-20 16:49:52

Chris Date 的定义实际上相当好,只要您理解他的意思即可:

每个属性

您的数据必须分解为单独的、不同的属性/列/值,这些属性/列/值不依赖于任何其他属性。 你的全名是一个属性。 你的生日是一个属性。 您的年龄不是一个属性,它取决于当前日期,该日期不属于您的出生日期。

必须表示一个事实

每个属性都是一个单独的事实,而不是事实的集合。 更改属性中的一位会改变整个含义。 你的生日是事实。 您的全名是事实吗? 嗯,在某些情况下确实如此,因为如果你更改姓氏,你的全名就会不同,对吗? 但对于家谱学家来说,你有姓氏和姓氏,如果你改变姓氏,你的姓氏不会改变,所以它们是不同的事实。

关于键,

有一个属性比较特殊,它是键。 键是一个属性,对于数据中的所有信息来说必须是唯一的,并且永远不能更改。 您的全名不是密钥,因为它可以更改。 您的社会保险号码不是关键,因为它们会被重复使用。 您的 SSN 加出生日期不是关键,即使该组合永远无法重复使用,因为属性不能是两个事实的组合。 GUID 是一把钥匙。 您递增且永远不会重复使用的数字是关键。

整个密钥,

单独的密钥必须足以[并且必要!]来识别您的价值观; 您不能使用不同的键表示相同的数据,键列的子集也不足以识别事实。
假设您有一个包含 GUID 键、名称和地址值的地址簿。 如果相同的名称代表不同的人并且不是“相同的数据”,则使用不同的键出现两次是可以的。
如果会计部门的玛丽·琼斯 (Mary Jones) 将她的名字更改为玛丽·史密斯 (Mary Smith),那么销售部门的玛丽·琼斯 (Mary Jones) 也不会更改她的名字。
另一方面,如果 Mary Smith 和 John Smith 具有相同的街道地址并且确实是同一个地方,则这是不允许的。 您必须使用街道地址和新密钥创建一个新的键/值对。

您也不允许使用这个新的单个街道地址的密钥作为地址簿中的值,因为现在相同的街道地址密钥将表示两次。
相反,您必须使用地址簿键和街道地址键的值创建第三个键/值对; 您可以通过在这组值中匹配某人的图书密钥和地址密钥来找到该人的街道地址。

除了钥匙,什么也没有 除了

钥匙之外,不能有任何其他东西可以识别你的价值观。 例如,如果允许您使用“泰姬陵”的地址(假设只有一个),则不允许在同一记录中使用城市值,
因为如果您知道地址,您也就知道了城市。 这也开启了在不同城市有不止一座泰姬陵的可能性。
相反,您必须再次创建具有唯一值的辅助位置键,例如泰姬陵、华盛顿特区的白宫等及其城市。
或者禁止使用某个城市特有的“地址”。

所以帮帮我吧,科德。

Chris Date's definition is actually quite good, so long as you understand what he means:

Each attribute

Your data must be broken into separate, distinct attributes/columns/values which do not depend on any other attributes. Your full name is an attribute. Your birthdate is an attribute. Your age is not an attribute, it depends on the current date which is not part of your birthdate.

must represent a fact

Each attribute is a single fact, not a collection of facts. Changing one bit in an attribute changes the whole meaning. Your birthdate is a fact. Is your full name a fact? Well, in some cases it is, because if you change your surname your full name is different, right? But to a genealogist you have a surname and a family name, and if you change your surname your family name does not change, so they are separate facts.

about the key,

One attribute is special, it's a key. The key is an attribute that must be unique for all information in your data and must never change. Your full name is not a key because it can change. Your Social Insurance Number is not a key because they get reused. Your SSN plus birthdate is not a key, even if the combination can never be reused, because an attribute cannot be a combination of two facts. A GUID is a key. A number you increment and never reuse is a key.

the whole key,

The key alone must be sufficient [and necessary!] to identify your values; you cannot have the same data represented by different keys, nor can a subset of the key columns be sufficient to identify the fact.
Suppose you had an address book with a GUID key, name and address values. It is OK to have the same name appearing twice with different keys if they represent different people and are not the "same data".
If Mary Jones in accounting changes her name to Mary Smith, Mary Jones in Sales does not change her name as well.
On the other hand, if Mary Smith and John Smith have the same street address and it really is the same place, this is not allowed. You have to create a new key/value pair with the street address and a new key.

You are also not allowed to use the key for this new single street address as a value in the address book since now the same street address key would be represented twice.
Instead, you have to make a third key/value pair with values of the address book key and the street address key; you find a person's street address by matching their book key and address key in this group of values.

and nothing but the key

There must be nothing other than the key that identifies your values. For example, if you are allowed an address of "The Taj Mahal" (assuming there is only one) you are not allowed a city value in the same record,
since if you know the address you would also know the city. This would also open up the possibility of there being more than one Taj Mahal in a different city.
Instead, you have to again create a secondary Location key with unique values like the Taj, the White House in DC, and so on, and their cities.
Or forbid "addresses" that are unique to a city.

So help me, Codd.

怀念你的温柔 2024-07-20 16:49:52

以下是维基百科页面上第三范式的一些有用摘录:

Bill Kent 定义了第三范式方式:

每个非关键属性“必须提供
关于钥匙的事实,整个钥匙,
除了钥匙什么也没有。”

要求非关键属性
依赖于“整个密钥”确保
一个表是 2NF 的; 更远
要求非关键属性
依赖“除了钥匙”
确保该表符合 3NF。

Chris Date 采用 Kent 的助记符来定义 Boyce-Codd 范式:

“每个属性必须代表一个事实
关于密钥、整个密钥以及
除了钥匙什么都没有。”这里
需求涉及到每一个
表中的属性,而不仅仅是
非关键属性。

当表具有多个复合候选键,并且一个候选键中的属性依赖于另一个候选键的部分时,这一点就会发挥作用。 第三范式不会禁止这一点,因为它排除了关键属性。 但 BCNF 也将该规则应用于关键属性。

至于如何使一个表满足BCNF,您需要用另一个属性来表示额外的依赖关系,并且可能通过将属性拆分到另一个表中来表示。

Here are some helpful excerpts from the Wikipedia page on Third Normal Form:

Bill Kent defines Third Normal Form this way:

Each non-key attribute "must provide
a fact about the key, the whole key,
and nothing but the key."

Requiring that non-key attributes be
dependent on "the whole key" ensures
that a table is in 2NF; further
requiring that non-key attributes be
dependent on "nothing but the key"
ensures that the table is in 3NF.

Chris Date adapts Kent's mnemonic to define Boyce-Codd Normal Form:

"Each attribute must represent a fact
about the key, the whole key, and
nothing but the key." Here the
requirement is concerned with every
attribute in the table, not just
non-key attributes.

This comes into play when a table has multiple compound candidate keys, and an attribute within one candidate keys has a dependency on a part of another candidate key. Third Normal Form wouldn't prohibit this, because it excludes key attributes. But BCNF applies the rule to key attributes as well.

As for how to make a table satisfy BCNF, you need to represent the extra dependency, with another attribute and possibly by splitting attributes into another table.

下雨或天晴 2024-07-20 16:49:52

我用谷歌搜索“boyce codd范式”,在维基百科之后这是第二个结果。 我的教科书对关系数据库管理系统给出了一个非常简单的定义:

每个重要 FD 的左侧都必须是超级键。

- Garcia-Molina、Ullman 和 Widom 所著的“Database Systems The Complete Book”。

I googled "boyce codd normal form" and after wikipedia this is the second result. My textbook gives a very simple definition in terms of relational database management systems:

The left side of every nontrivial FD must be a superkey.

-"Database Systems The Complete Book" by Garcia-Molina, Ullman and Widom.

夕色琉璃 2024-07-20 16:49:52

我读过的最好的非正式答案是,在 BCNF 中,每个函数依赖项中的每个“箭头”都是候选键外的“箭头”。 我不记得来源了,但这可能是克里斯·戴特写的。

The best informal answer I've read is that, in BCNF, every "arrow" in every functional dependency is an "arrow" out of a candidate key. I don't recall the source, but it was probably something Chris Date wrote.

ヅ她的身影、若隐若现 2024-07-20 16:49:52

基本上博伊斯-科德是“第五范式”。 它可以通过数据模型中“属性实体”的存在来直观地识别,例如类型(例如角色、状态、流程状态、位置类型、电话类型等)。
属性实体(子子类型)是进一步对类级别实体进行分类的有限值集的列表。 因此,您可能有电话类型(“移动”、“办公桌”、“VOIP”)、电子邮件帐户类型(“企业”、“个人”、“游戏”)、角色(项目经理、数据建模师、超级模特)等。
另一个形态线索是超类型(又名大师班、超类、元实体)的存在,例如当事人(子类型是公司、个人等)。

它基本上是原子或叶级别的分类法(..不,视频并不那么令人兴奋); 请参阅上面 Bill Karwin 的评论以获得更多技术解释。

Boyce-Codd 级别模型本质上是高度详细的逻辑模型,源自更简单的基于业务的概念模型。 **它们通常不会在物理模型中完全实现,因为针对性能(或功能简单性)的 PDM 优化可能会导致超类型和属性实体作为 UI 中的下拉列表或在幕后逻辑中进行管理在应用程序中或在数据库约束和方法中强制引用完整性。 (即,它们可能最终作为 PDM 模式中的查找表,或者它们可能由代码处理而不在数据库中表示)。

那么,如果它们最终可能不会出现在 PDM 中,为什么还要这样做呢? 出于同样的原因,您在“优化”之前构建了一个良好的 3NF 模型,以便数据库结构反映现实世界,因此比我们继承的典型拼凑更稳定,并且必须做出英雄行为才能作为我们的业务/客户工作要求发生变化。

Basically Boyce-Codd is "fifth normal form". It is visually recognizable by the existance of "Attributive entities" in the data model, for things like Types (e.g. roles, status, process state, location-type, phone-type, etc).
The attributive entities (sub-subtypes) are lists of finite sets of values that further categorize a class level entity. So you may have a phone-type ('mobile', ' desk', 'VOIP') email account type ('business', 'personal', 'gaming'), role (project manager, data modeler, super model) etc.
Another morphological clue is the existance of super-types, (aka. master-classes, super-classes, meta-entities) such as Parties (subtypes being company, person, etc.).

It's basically Taxonomy gone wild (..no the video is not that exciting) to the atomic or leaf-level; see Bill Karwin's comment above for a more technical explanation.

Boyce-Codd level models are essentially highly detailed logical models, derived from more simplistic business-based conceptual models. **They are typically NOT implemented ver batim in the PHYSICAL model, because PDM optimization for performance (or functional simplicity) may result in the super-types and attributive entities being managed as drop-down lists in UIs, or in behind the scenes logic in the application, or in database constraints and methods to enforce referential integrity. (i.e. they may end up as look-up tables in the PDM schema, or they may be handled by code and not represented in the database).

So - why do them if they may not end up in the PDM? For the same reason you build a good 3NF model before you 'optimize', so that the database structure reflects the real world and is hence more stable than the typical kludges we inherit and have to do heroic acts to make work as our business/clients requirements change.

相权↑美人 2024-07-20 16:49:52

很多时候,听从自己的直觉是最容易的,这会是自然而然的。 一般来说,如果你满足了3NF,那么你就满足了BCNF。 这不包括 ERD 的详细分析或示例,但根据 Codd 的说法,有 13 条规则。 我发现最好遵循这些规则,但永远记住,没有一种正确的做事方法,所以宽松地遵循它们。 因此,关于 RDBMS,规则如下:

http: //www.87android.com/12-rules-of-relational-database-model-by-codd/

这可能无法直接回答问题,但如果您询问如何到达 BCNF 或简单的如果你没有记住它,那么你就对标准化还不够了解。 但这并不重要。 关系数据库有多种形式,但做得好的很少。 您能做的最好的事情就是了解关系意味着什么,遵循上述规则,并且不要担心规范化的程度。 标准化过程消除了数据的重复。 通过迁移功能依赖性,每个级别都更是如此。 记住这一点,你会没事的,你的直觉和智慧会完成剩下的事情。

Often times it is easiest to listen to your gut and this will come naturally. Generally speaking, if you meet 3NF you have met BCNF. This doesn't cover detailed analysis of an ERD or have examples but there are thirteen rules according to Codd. I find it best to follow these rules but always remember there is no one correct way to do things so follow them loosely. So regarding the RDBMS, here are the rules:

http://www.87android.com/12-rules-of-relational-database-model-by-codd/

This may not answer the question directly, but if you are asking about how to get to BCNF or an easy way to remember it then you don't understand normalization well enough. That is of no concern though. Relational databases take many forms and very few are done well. The best thing you can do is know what it means to be relational, follow the rules above, and do not worry about the level of normalization. The process of normalization eliminates the duplication of data. Each level more so by moving into migration of functional dependencies. Keep that in mind and you will be fine, your gut and intellect will do the rest.

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文