SQL:数据库规范化,同时保留约束

发布于 2024-12-01 10:03:01 字数 2111 浏览 1 评论 0原文

假设我有下表:

     ____________________             ____________________
    |     Organisms      |           |       Species      |
    |--------------------|           |--------------------|
    |OrganismId (int, PK)|           |SpeciesId (int, PK) |
    |SpeciesId (int, FK) |∞---------1|Name (varchar)      |
    |Name (varchar)      |           |____________________|
    |____________________|                      1
              1                                 |
              |                                 |
              |                                 |
              ∞                                 ∞
    ______________________        ____________________          _______________
   | OrganismPropsValues  |      |   SpeciesProps     |        |     Props     |
   |----------------------|      |--------------------|        |---------------|
   |OrganismId (int, FK)  |      |PropId (int,PK,FK)  | ∞-----1|PropId (int,PK)|
   |PropId (int, FK)      |      |SpeciesId(int,PK,FK)|        |Name (varchar) |
   |Value (varchar)       |      |____________________|        |_______________|
   |______________________|                                             1
              ∞                                                         |
              |                                                         |
              -----------------------------------------------------------

对我在这里试图表示的内容的快速解释:假设我们有一个物种列表,例如猫、狗、人类等。我们还有一组属性(缩写为 Props,因此我可以更容易在图中拟合),这适用于某些但不一定是所有物种 - 例如,这可能是尾巴长度(对于有尾巴的物种)、眼睛颜色(对于有眼睛的物种)等。SpeciesProps

是一个链接器表定义哪些属性适用于哪些物种 - 所以这里我们会有 {人类,眼睛颜色},{狗,眼睛颜色},{猫,眼睛颜色},{狗,尾巴长度},{猫,尾巴长度}。我们没有 {Human, Tail Length},因为 Tail Length 显然不是适用于人类的有效属性。

有机体表保存了物种的实际“实现”——所以这里我们可能有{人类,鲍勃},{狗,鲁弗斯}和{猫,菲利克斯}。

现在我的问题是:在 OrganismPropsValues 表中,我想存储每个生物体的属性“值” - 例如,对于鲍勃,我想存储 {鲍勃,眼睛颜色,蓝色}。对于 Rufus,我想存储 {Rufus,Eye Color,Brown} 和 {Rufus,Tail Length,20}(与 Felix 类似)。然而,我的问题是,在我详细说明的模式中,完全可以存储 {Bob, Tail Length, 10},即使 SpeciesProps 中不存在 {Human, Tail Length} 元组。如何修改此模式,以便可以强制执行 OrganismPropsValues 中 SpeciesProps 中定义的约束,同时保持足够的标准化?

Suppose I have the following tables:

     ____________________             ____________________
    |     Organisms      |           |       Species      |
    |--------------------|           |--------------------|
    |OrganismId (int, PK)|           |SpeciesId (int, PK) |
    |SpeciesId (int, FK) |∞---------1|Name (varchar)      |
    |Name (varchar)      |           |____________________|
    |____________________|                      1
              1                                 |
              |                                 |
              |                                 |
              ∞                                 ∞
    ______________________        ____________________          _______________
   | OrganismPropsValues  |      |   SpeciesProps     |        |     Props     |
   |----------------------|      |--------------------|        |---------------|
   |OrganismId (int, FK)  |      |PropId (int,PK,FK)  | ∞-----1|PropId (int,PK)|
   |PropId (int, FK)      |      |SpeciesId(int,PK,FK)|        |Name (varchar) |
   |Value (varchar)       |      |____________________|        |_______________|
   |______________________|                                             1
              ∞                                                         |
              |                                                         |
              -----------------------------------------------------------

A quick explanation of what I am trying to represent here: suppose we have a list of species, such as cat, dog, human, etc. We also have a set of properties (abbreviated Props so I could fit it more easily in the diagram) which apply to some but not necessarily all species--for example, this may be tail length (for species with tails), eye color (for those with eyes), etc.

SpeciesProps is a linker table that defines which properties apply to which species-- so here we would have {Human, Eye Color}, {Dog, Eye Color}, {Cat, Eye Color}, {Dog, Tail Length}, {Cat, Tail Length}. We do not have {Human, Tail Length} because Tail Length is obviously not a valid property to apply to a human.

The Organisms table holds actual "implementations" of the species-- So here we might have {Human, Bob}, {Dog, Rufus}, and {Cat, Felix}.

Here is now my issue: in the OrganismPropsValues table, I want to store the 'values' of the properties for each organism--so for example, for Bob I want to store {Bob, Eye Color, Blue}. For Rufus, I would want to store {Rufus, Eye Color, Brown} and {Rufus, Tail Length, 20} (similar for Felix). My problem however, is that in the schema that I have detailed, it is perfectly possible to store {Bob, Tail Length, 10}, even though the {Human, Tail Length} tuple does not exist in SpeciesProps. How can I modify this schema so I can enforce the constraints defined in SpeciesProps in OrganismPropsValues, while maintaining adequate normalization?

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(4

乱世争霸 2024-12-08 10:03:01

您正在实现 Entity-Attribute-Value 反模式。这不可能是标准化的数据库设计,因为它不是关系型的。

我建议采用类表继承设计模式:

  • 为以下对象创建一个表 :生物体,包含所有物种共有的特性。

  • 为每个物种创建一个表,其中包含特定于该物种的属性。这些表中的每一个都与有机体具有一对一的关系,但每个属性都属于自己的列。

    <前><代码> ____________________ ____________________
    |生物| |物种 |
    |--------------------| |--------------------|
    |OrganismId(int,PK)| |SpeciesId (int, PK) |
    |SpeciesId (int, FK) |∞---------1|名称 (varchar) |
    |名称(varchar)| |____________________|
    |____________________|
    1
    |
    |
    1
    ______________________
    |人类有机体 |
    |----------------------|
    |OrganismId (int, FK) |
    |性别(枚举)|
    |种族(int,FK)|
    |EyeColor(int,FK)|
    |.... |
    |______________________|

这确实意味着您将创建许多表,但请将此视为以正确的关系方式存储属性的许多实际好处的权衡:

  • 您可以适当地使用 SQL 数据类型,而不是将所有内容都视为自由格式的 varchar。
  • 您可以使用约束或查找表通过一组预定义的值来限制某些属性。
  • 您可以强制属性(即 NOT NULL)或使用其他约束。
  • 数据和索引的存储更加高效。
  • 查询对于您来说更容易编写,对于 RDBMS 来说也更容易执行。

有关此设计的更多信息,请参阅 Martin Fowler 的书企业应用程序架构模式,或我的演示< a href="http://www.slideshare.net/billkarwin/practical-object-orient-driven-models-in-sql" rel="nofollow noreferrer">实用的面向对象模型SQL,或者我的书,SQL 反模式卷 1:避免数据库编程的陷阱

You're implementing the Entity-Attribute-Value antipattern. This can't be a normalized database design, because it's not relational.

What I would suggest instead is the Class Table Inheritance design pattern:

  • Create one table for Organisms, containing properties common to all species.

  • Create one table per species, containing properties specific to that species. Each of these tables has a 1-to-1 relationship with Organisms, but each property belongs in its own column.

       ____________________             ____________________
      |     Organisms      |           |       Species      |
      |--------------------|           |--------------------|
      |OrganismId (int, PK)|           |SpeciesId (int, PK) |
      |SpeciesId (int, FK) |∞---------1|Name (varchar)      |
      |Name (varchar)      |           |____________________|
      |____________________|
                1
                |
                |
                1
       ______________________ 
      |    HumanOrganism     |
      |----------------------|
      |OrganismId (int, FK)  |
      |Sex      (enum)       |
      |Race     (int, FK)    |
      |EyeColor (int, FK)    |
      |....                  |
      |______________________|
    

This does mean you will create many tables, but consider this as a tradeoff with the many practical benefits to storing properties in a relationally correct way:

  • You can use SQL data types appropriately, instead of treating everything a free-form varchar.
  • You can use constraints or lookup tables to restrict certain properties by a predefined set of values.
  • You can make properties mandatory (i.e. NOT NULL) or use other constraints.
  • Data and indexes are stored more efficiently.
  • Queries are easier for you to write and easier for the RDBMS to execute.

For more on this design, see Martin Fowler's book Patterns of Enterprise Application Architecture, or my presentation Practical Object-Oriented Models in SQL, or my book, SQL Antipatterns Volume 1: Avoiding the Pitfalls of Database Programming.

物价感观 2024-12-08 10:03:01

嗯...
这是一种方法:
将 SpeciesPropsId 添加到 SpeciesProps 表中。
将 OrganismPropsValues 表中的 PropId 替换为 SpeciesPropsId。
您需要稍微更改一下约束。
需要将 SpeciesProps 添加到 OrganismPropsValues 约束中。
需要删除 OrganismPropsValues 以进行 Props 约束。

从技术上讲,您不必从 OrganismPropsValues 中删除 PropId,但如果保留它,则会导致数据冗余。

Hmm...
Here is one way to do it:
Add SpeciesPropsId into SpeciesProps table.
Replace PropId with SpeciesPropsId in the OrganismPropsValues table.
You will need to change constrains a bit.
Need to add SpeciesProps to OrganismPropsValues constrain.
Need to remove OrganismPropsValues to Props constrain.

Technically you do not have to remove PropId from OrganismPropsValues, but if you keep it it will make data redundat.

水溶 2024-12-08 10:03:01

每当您有这样的菱形依赖项时,请考虑更加重视复合主键。

具体来说,不仅通过 OrganismId 来识别生物体,还可以通过 SpeciesIdOrganismSubId 的组合来识别生物体(您仍然可以使用 OrganismId code>,但将其保留为备用键 - 为简洁起见,此处未显示)。

一旦你这样做了,你的模型就可以看起来像这样:

ER Model

这里要注意的关键是 < code>SpeciesId 沿着该菱形图的两条边“传播”。这就是为您提供了所需的限制,即无法为未针对给定物种“声明”的属性“分配值”。

顺便说一句,在命名表时使用单数。另外,考虑使用自然主键(例如 SpeciesName 而不是 SpeciesId 作为 PK) - 如果做得正确,它可以显着提高 JOIN 的速度(尤其是与集群结合使用) .

Whenever you have a diamond-shaped dependency like this, consider putting more emphasis on composite PRIMARY KEYS.

Specifically, identify the Organism not just by OrganismId, but by the combination of SpeciesId and OrganismSubId (you can still have OrganismId, but keep it as an alternate key - not show here for brevity).

Once you do that, your model can be made to look like this:

ER Model

The key thing to note here is that SpeciesId is "propagated" down both edges of this diamond-shaped graph. This is what gives you the desired restriction of not being able "assign a value" to a property that was not "declared" for the given species.

BTW, use singular when naming your tables. Also, consider using natural primary keys (e.g. SpeciesName instead of SpeciesId as PK) - if done right it can significantly increase the speed of your JOINs (especially in conjunction with clustering).

め可乐爱微笑 2024-12-08 10:03:01

实现这些约束的另一种方法是通过删除 OrganismId 并添加 No 来更改 Organism 表的 PK。然后对化合物 (SpeciesId, No) 进行 PK。因此,“Bob” 将是 (Human, 1)“Rufus” 将是 (Dog, 1) > 等。

然后,在 OrganismPropsValues 表中添加 SpeciesIdNo(删除 OrganismId)。 )

这将允许将 FK 从 OrganismPropsValues 更改为 Props 以引用 SpeciesProps

     ____________________             ____________________
    |     Organisms      |           |       Species      |
    |--------------------|           |--------------------|
    |SpeciesId (int, FK) |           |SpeciesId (int, PK) |
    |No (int)            |∞---------1|Name (varchar)      |
    |Name (varchar)      |           |____________________|
    |PK (SpeciedId,No)   |                      1
    |____________________|                      |
              1                                 |
              |                                 |
              |                                 |
              ∞                                 ∞
    ______________________        ____________________          _______________
   | OrganismPropsValues  |      |   SpeciesProps     |        |     Props     |
   |----------------------|      |--------------------|        |---------------|
   |SpeciesId (int, PK)   |      |PropId (int,PK,FK)  | ∞-----1|PropId (int,PK)|
   |No (int, PK)          |      |SpeciesId(int,PK,FK)|        |Name (varchar) |
   |PropId (int, PK)      |      |____________________|        |_______________|
   |Value (varchar)       |                 1
   |FK (SpeciesId,No)     |                 |
   |FK (SpeciesId,PropId) |                 |
   |______________________|                 |
              ∞                             |
              |                             |
              -------------------------------

Another way to achieve these constraints would be to change the PK of Organism table by dropping OrganismId and adding a No. Then make PK the compound (SpeciesId, No). So, "Bob" would be (Human, 1), "Rufus" would be (Dog, 1), etc.

Then, add in the OrganismPropsValues table, the SpeciesId and the No (removing the OrganismId.)

This will allow to change the FK from OrganismPropsValues to Props to reference SpeciesProps instead:

     ____________________             ____________________
    |     Organisms      |           |       Species      |
    |--------------------|           |--------------------|
    |SpeciesId (int, FK) |           |SpeciesId (int, PK) |
    |No (int)            |∞---------1|Name (varchar)      |
    |Name (varchar)      |           |____________________|
    |PK (SpeciedId,No)   |                      1
    |____________________|                      |
              1                                 |
              |                                 |
              |                                 |
              ∞                                 ∞
    ______________________        ____________________          _______________
   | OrganismPropsValues  |      |   SpeciesProps     |        |     Props     |
   |----------------------|      |--------------------|        |---------------|
   |SpeciesId (int, PK)   |      |PropId (int,PK,FK)  | ∞-----1|PropId (int,PK)|
   |No (int, PK)          |      |SpeciesId(int,PK,FK)|        |Name (varchar) |
   |PropId (int, PK)      |      |____________________|        |_______________|
   |Value (varchar)       |                 1
   |FK (SpeciesId,No)     |                 |
   |FK (SpeciesId,PropId) |                 |
   |______________________|                 |
              ∞                             |
              |                             |
              -------------------------------
~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文