SQL：数据库规范化，同时保留约束

发布于 2024-12-01 10:03:01 字数 2111 浏览 4 评论 0原文

假设我有下表：

     ____________________             ____________________
    |     Organisms      |           |       Species      |
    |--------------------|           |--------------------|
    |OrganismId (int, PK)|           |SpeciesId (int, PK) |
    |SpeciesId (int, FK) |∞---------1|Name (varchar)      |
    |Name (varchar)      |           |____________________|
    |____________________|                      1
              1                                 |
              |                                 |
              |                                 |
              ∞                                 ∞
    ______________________        ____________________          _______________
   | OrganismPropsValues  |      |   SpeciesProps     |        |     Props     |
   |----------------------|      |--------------------|        |---------------|
   |OrganismId (int, FK)  |      |PropId (int,PK,FK)  | ∞-----1|PropId (int,PK)|
   |PropId (int, FK)      |      |SpeciesId(int,PK,FK)|        |Name (varchar) |
   |Value (varchar)       |      |____________________|        |_______________|
   |______________________|                                             1
              ∞                                                         |
              |                                                         |
              -----------------------------------------------------------

对我在这里试图表示的内容的快速解释：假设我们有一个物种列表，例如猫、狗、人类等。我们还有一组属性（缩写为 Props，因此我可以更容易在图中拟合），这适用于某些但不一定是所有物种 - 例如，这可能是尾巴长度（对于有尾巴的物种）、眼睛颜色（对于有眼睛的物种）等。SpeciesProps

是一个链接器表定义哪些属性适用于哪些物种 - 所以这里我们会有 {人类，眼睛颜色}，{狗，眼睛颜色}，{猫，眼睛颜色}，{狗，尾巴长度}，{猫，尾巴长度}。我们没有 {Human, Tail Length}，因为 Tail Length 显然不是适用于人类的有效属性。

有机体表保存了物种的实际“实现”——所以这里我们可能有{人类，鲍勃}，{狗，鲁弗斯}和{猫，菲利克斯}。

现在我的问题是：在 OrganismPropsValues 表中，我想存储每个生物体的属性“值” - 例如，对于鲍勃，我想存储 {鲍勃，眼睛颜色，蓝色}。对于 Rufus，我想存储 {Rufus，Eye Color，Brown} 和 {Rufus，Tail Length，20}（与 Felix 类似）。然而，我的问题是，在我详细说明的模式中，完全可以存储 {Bob, Tail Length, 10}，即使 SpeciesProps 中不存在 {Human, Tail Length} 元组。如何修改此模式，以便可以强制执行 OrganismPropsValues 中 SpeciesProps 中定义的约束，同时保持足够的标准化？

原文

Suppose I have the following tables:

     ____________________             ____________________
    |     Organisms      |           |       Species      |
    |--------------------|           |--------------------|
    |OrganismId (int, PK)|           |SpeciesId (int, PK) |
    |SpeciesId (int, FK) |∞---------1|Name (varchar)      |
    |Name (varchar)      |           |____________________|
    |____________________|                      1
              1                                 |
              |                                 |
              |                                 |
              ∞                                 ∞
    ______________________        ____________________          _______________
   | OrganismPropsValues  |      |   SpeciesProps     |        |     Props     |
   |----------------------|      |--------------------|        |---------------|
   |OrganismId (int, FK)  |      |PropId (int,PK,FK)  | ∞-----1|PropId (int,PK)|
   |PropId (int, FK)      |      |SpeciesId(int,PK,FK)|        |Name (varchar) |
   |Value (varchar)       |      |____________________|        |_______________|
   |______________________|                                             1
              ∞                                                         |
              |                                                         |
              -----------------------------------------------------------

A quick explanation of what I am trying to represent here: suppose we have a list of species, such as cat, dog, human, etc. We also have a set of properties (abbreviated Props so I could fit it more easily in the diagram) which apply to some but not necessarily all species--for example, this may be tail length (for species with tails), eye color (for those with eyes), etc.

SpeciesProps is a linker table that defines which properties apply to which species-- so here we would have {Human, Eye Color}, {Dog, Eye Color}, {Cat, Eye Color}, {Dog, Tail Length}, {Cat, Tail Length}. We do not have {Human, Tail Length} because Tail Length is obviously not a valid property to apply to a human.

The Organisms table holds actual "implementations" of the species-- So here we might have {Human, Bob}, {Dog, Rufus}, and {Cat, Felix}.

Here is now my issue: in the OrganismPropsValues table, I want to store the 'values' of the properties for each organism--so for example, for Bob I want to store {Bob, Eye Color, Blue}. For Rufus, I would want to store {Rufus, Eye Color, Brown} and {Rufus, Tail Length, 20} (similar for Felix). My problem however, is that in the schema that I have detailed, it is perfectly possible to store {Bob, Tail Length, 10}, even though the {Human, Tail Length} tuple does not exist in SpeciesProps. How can I modify this schema so I can enforce the constraints defined in SpeciesProps in OrganismPropsValues, while maintaining adequate normalization?

分享到QQ

分享到微博

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

需要登录才能够评论，你可以免费注册一个本站的账号。

乱世争霸 2024-12-08 10:03:01

您正在实现 Entity-Attribute-Value 反模式。这不可能是标准化的数据库设计，因为它不是关系型的。

我建议采用类表继承设计模式：

为以下对象创建一个表：生物体，包含所有物种共有的特性。
为每个物种创建一个表，其中包含特定于该物种的属性。这些表中的每一个都与有机体具有一对一的关系，但每个属性都属于自己的列。
<前><代码> ____________________ ____________________
|生物| |物种 |
|--------------------| |--------------------|
|OrganismId（int，PK）| |SpeciesId (int, PK) |
|SpeciesId (int, FK) |∞---------1|名称 (varchar) |
|名称（varchar）| |____________________|
|____________________|
1
|
|
1
______________________
|人类有机体 |
|----------------------|
|OrganismId (int, FK) |
|性别（枚举）|
|种族（int，FK）|
|EyeColor（int，FK）|
|.... |
|______________________|

这确实意味着您将创建许多表，但请将此视为以正确的关系方式存储属性的许多实际好处的权衡：

您可以适当地使用 SQL 数据类型，而不是将所有内容都视为自由格式的 varchar。
您可以使用约束或查找表通过一组预定义的值来限制某些属性。
您可以强制属性（即 NOT NULL）或使用其他约束。
数据和索引的存储更加高效。
查询对于您来说更容易编写，对于 RDBMS 来说也更容易执行。

有关此设计的更多信息，请参阅 Martin Fowler 的书企业应用程序架构模式，或我的演示< a href="http://www.slideshare.net/billkarwin/practical-object-orient-driven-models-in-sql" rel="nofollow noreferrer">实用的面向对象模型SQL，或者我的书，SQL 反模式卷 1：避免数据库编程的陷阱。

You're implementing the Entity-Attribute-Value antipattern. This can't be a normalized database design, because it's not relational.

What I would suggest instead is the Class Table Inheritance design pattern:

Create one table for Organisms, containing properties common to all species.

Create one table per species, containing properties specific to that species. Each of these tables has a 1-to-1 relationship with Organisms, but each property belongs in its own column.

   ____________________             ____________________
  |     Organisms      |           |       Species      |
  |--------------------|           |--------------------|
  |OrganismId (int, PK)|           |SpeciesId (int, PK) |
  |SpeciesId (int, FK) |∞---------1|Name (varchar)      |
  |Name (varchar)      |           |____________________|
  |____________________|
            1
            |
            |
            1
   ______________________ 
  |    HumanOrganism     |
  |----------------------|
  |OrganismId (int, FK)  |
  |Sex      (enum)       |
  |Race     (int, FK)    |
  |EyeColor (int, FK)    |
  |....                  |
  |______________________|

This does mean you will create many tables, but consider this as a tradeoff with the many practical benefits to storing properties in a relationally correct way:

You can use SQL data types appropriately, instead of treating everything a free-form varchar.
You can use constraints or lookup tables to restrict certain properties by a predefined set of values.
You can make properties mandatory (i.e. NOT NULL) or use other constraints.
Data and indexes are stored more efficiently.
Queries are easier for you to write and easier for the RDBMS to execute.

For more on this design, see Martin Fowler's book Patterns of Enterprise Application Architecture, or my presentation Practical Object-Oriented Models in SQL, or my book, SQL Antipatterns Volume 1: Avoiding the Pitfalls of Database Programming.

回复收藏 0 原文

物价感观 2024-12-08 10:03:01

嗯...
这是一种方法：
将 SpeciesPropsId 添加到 SpeciesProps 表中。
将 OrganismPropsValues 表中的 PropId 替换为 SpeciesPropsId。
您需要稍微更改一下约束。
需要将 SpeciesProps 添加到 OrganismPropsValues 约束中。
需要删除 OrganismPropsValues 以进行 Props 约束。

从技术上讲，您不必从 OrganismPropsValues 中删除 PropId，但如果保留它，则会导致数据冗余。

回复收藏 0 原文

水溶 2024-12-08 10:03:01

每当您有这样的菱形依赖项时，请考虑更加重视复合主键。

具体来说，不仅通过 OrganismId 来识别生物体，还可以通过 SpeciesId 和 OrganismSubId 的组合来识别生物体（您仍然可以使用 OrganismId code>，但将其保留为备用键 - 为简洁起见，此处未显示）。

一旦你这样做了，你的模型就可以看起来像这样：

ER Model

这里要注意的关键是 < code>SpeciesId 沿着该菱形图的两条边“传播”。这就是为您提供了所需的限制，即无法为未针对给定物种“声明”的属性“分配值”。

顺便说一句，在命名表时使用单数。另外，考虑使用自然主键（例如 SpeciesName 而不是 SpeciesId 作为 PK） - 如果做得正确，它可以显着提高 JOIN 的速度（尤其是与集群结合使用） .

回复收藏 0 原文

め可乐爱微笑 2024-12-08 10:03:01

实现这些约束的另一种方法是通过删除 OrganismId 并添加 No 来更改 Organism 表的 PK。然后对化合物 (SpeciesId, No) 进行 PK。因此，“Bob” 将是 (Human, 1)，“Rufus” 将是 (Dog, 1) > 等。

然后，在 OrganismPropsValues 表中添加 SpeciesId 和 No（删除 OrganismId）。）

这将允许将 FK 从 OrganismPropsValues 更改为 Props 以引用 SpeciesProps：

     ____________________             ____________________
    |     Organisms      |           |       Species      |
    |--------------------|           |--------------------|
    |SpeciesId (int, FK) |           |SpeciesId (int, PK) |
    |No (int)            |∞---------1|Name (varchar)      |
    |Name (varchar)      |           |____________________|
    |PK (SpeciedId,No)   |                      1
    |____________________|                      |
              1                                 |
              |                                 |
              |                                 |
              ∞                                 ∞
    ______________________        ____________________          _______________
   | OrganismPropsValues  |      |   SpeciesProps     |        |     Props     |
   |----------------------|      |--------------------|        |---------------|
   |SpeciesId (int, PK)   |      |PropId (int,PK,FK)  | ∞-----1|PropId (int,PK)|
   |No (int, PK)          |      |SpeciesId(int,PK,FK)|        |Name (varchar) |
   |PropId (int, PK)      |      |____________________|        |_______________|
   |Value (varchar)       |                 1
   |FK (SpeciesId,No)     |                 |
   |FK (SpeciesId,PropId) |                 |
   |______________________|                 |
              ∞                             |
              |                             |
              -------------------------------

Another way to achieve these constraints would be to change the PK of Organism table by dropping OrganismId and adding a No. Then make PK the compound (SpeciesId, No). So, "Bob" would be (Human, 1), "Rufus" would be (Dog, 1), etc.

Then, add in the OrganismPropsValues table, the SpeciesId and the No (removing the OrganismId.)

This will allow to change the FK from OrganismPropsValues to Props to reference SpeciesProps instead:

     ____________________             ____________________
    |     Organisms      |           |       Species      |
    |--------------------|           |--------------------|
    |SpeciesId (int, FK) |           |SpeciesId (int, PK) |
    |No (int)            |∞---------1|Name (varchar)      |
    |Name (varchar)      |           |____________________|
    |PK (SpeciedId,No)   |                      1
    |____________________|                      |
              1                                 |
              |                                 |
              |                                 |
              ∞                                 ∞
    ______________________        ____________________          _______________
   | OrganismPropsValues  |      |   SpeciesProps     |        |     Props     |
   |----------------------|      |--------------------|        |---------------|
   |SpeciesId (int, PK)   |      |PropId (int,PK,FK)  | ∞-----1|PropId (int,PK)|
   |No (int, PK)          |      |SpeciesId(int,PK,FK)|        |Name (varchar) |
   |PropId (int, PK)      |      |____________________|        |_______________|
   |Value (varchar)       |                 1
   |FK (SpeciesId,No)     |                 |
   |FK (SpeciesId,PropId) |                 |
   |______________________|                 |
              ∞                             |
              |                             |
              -------------------------------

回复收藏 0 原文

~没有更多了~