数据库模式相关问题

发布于 2024-10-04 08:17:34 字数 299 浏览 5 评论 0原文

我有一个关于数据库的理论问题。为了使它更具体,我想出了一个例子。

假设我有一家有产品的商店。我有很多不同的产品。并非每种产品都具有相同的适用属性。例如,我可以定义硬盘的大小(以 GB 为单位),但不能在 CPU 上使用相同的属性,因为它不适用。我想要的是一个可以动态向产品添加属性的数据库。我唯一能想到的是:

一个带有 ID、名称和描述的产品表。

一个属性表,包含 ID、Product_ID、属性和值。

这样我可能会得到一个巨大的、我认为效率不高的属性表。这已经困扰我很长时间了。有谁知道更好的解决方案来解决我的问题?

I have a kind of theoretical question about databases. To make it more concrete I have thought up an example case.

Suppose I have a store with products. I have lots of different products. Not every product has the same applicable properties. For instance, I could define the size of a harddisk in gigabytes, but cannot use that same property on a CPU, simply because it does not apply. What I want is a database where I can add properties to products dynamically. The only thing I can come up with is the following:

One product table with an ID, a Name and a Description.

One properties table with an ID, Product_ID, Property and a Value.

This way I would potentially get a gigantic, I-suppose-not-so-efficient, table of properties. This has been bugging me for a long time now. Does anyone know a better solution to my problem?

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(2

青丝拂面 2024-10-11 08:17:34

这实际上正在走向第六范式,只是像你这样没有学术或经验背景的人不知道(a)它的名称和(b)规则和注意事项。这些人实现了通常所说的实体-属性-值或 EAV。如果做得正确,那就没问题,并且有成千上万的医疗系统在此类表格中包含诊断和剂量信息。如果不是,那么它的使用和维护就只是一只狗的早餐。

  1. 首先确保您拥有真正且完整的 5NF 中的产品

  2. 始终使用完整的声明性引用完整性; CHECK 约束和RULES

  3. 切勿将所有内容放入一个带有 VARCHAR() 值的表中。始终使用正确的(适用的)数据类型。这意味着您将拥有多个表,每种数据类型各一个,并且不会失去控制或完整性。

  4. 同样,任何关联表(其中存在对另一个表的多重引用[例如供应商])必须是单独的。

    • 我提供了一个具有所讨论的完全控制权的数据模型;它包括一个简单的目录,可用于验证和导航。您需要添加每个 CHECK 约束和 RULE 以确保数据和引用完整性不会丢失。这意味着,例如:
      • 对于存储在 ProductDecimal 中的 CPUSpeed 列,检查它是否在正确的值范围内
      • 对于每个子Product检查ProductType-ColumnNo组合的DataType是否正确
    • 此结构比大多数 EAV 好得多,但不完全是完整的 6NF。
  5. 保留Product中的所有必填列;仅将子产品表用于可选列。

  6. 对于每个这样的(例如,Product)表,您需要创建一个视图(虚线),它将从 EAV/6NF 表构造 5NF 行。您可能有多个视图:Product_CPUProduct_Disk

  7. 不要通过视图更新。将所有更新保存在存储过程中,并插入或更新每个列(即适用于每个特定的 Productsub-Product 表) ProductType) 一起。

  8. 巨大?商业数据库(不是免费软件)对于大型表或连接没有问题。这实际上是一个非常有效的结构,并且允许非常快速的搜索,因为表实际上是面向列的(而不是面向行的)。如果人口是巨大的,那么它就是巨大的,你自己算一下。

  9. 您还需要一张表,即Property(或Attribute)的查找表。这是目录的一部分,并且基于 ProductType

更好的解决方案是采用完整、正式的第六范式。如果您只有一个或几个需要可选列的表,则没有必要。

需要明确的是:

  • 第六范式是行由主键和最多一个属性组成。

  • 这是 6NF(至少对于 Product 表簇),然后通过 DataType 再次规范化(不是范式意义上的),以减少表的数量(否则每个属性都会有一个表)。

  • 这保留了完整的 Rdb 控制(FK、约束等);而常见的 EAV 类型不关心 DRI 和控制。

  • 这也有目录的雏形。

产品集群数据模型链接

IDEF1X 符号链接

更新

您可能对此感兴趣▶5NF 6NF讨论◀。我会在某个时候把它写下来。

This is actually moving towards Sixth Normal Form, it is just that people like you who do not have the academic or experiential background do not know the (a) name for it and (b) the rules and the caveats. Such people have implemented what is commonly know as Entity-Attribute-Value or EAV. If it is done properly, it is fine, and there are many thousands of medical system out there carrying diagnostic and dosage info in such tables. If it is not, then it is one dog's breakfast to use and maintain.

  1. First make sure you have Product in true and full 5NF.

  2. Always use full Declarative Referential Integrity; CHECK constraints and RULES.

  3. Never put all that into one table with a VARCHAR() for Value. Always use the correct (applicable) DataTypes. That means you will have several tables, one each per DataType, and there is no loss of control or integrity.

  4. Likewise any Associative tables (where there is a multiple reference to another table [eg. Vendor] ) must be separate.

    • I am providing a Data Model which has the full control discussed; it includes a simple catalogue which can be used for verification as well as navigation. You need to add every CHECK Constraint and RULE to ensure that the data and referential Integrity is not lost. That means, eg:
      • for the CPUSpeed column, which is stored in ProductDecimal, CHECK that it is in the proper range of values
      • for each sub-Product table CHECK that the DataType is correct for the ProductType-ColumnNo combination
    • This structure is way better than most EAV, and not quite the full 6NF.
      .
  5. Keep all the mandatory columns in Product; use the sub-Product tables for optional columns only.

  6. For each such (eg Product) table, you need to create a View (dotted line), which will construct the 5NF rows from the EAV/6NF tables. You may have several Views: Product_CPU, Product_Disk.

  7. Do not update via the View. Keep all your updates transactional, in a stored proc, and insert or update each of the columns (ie. the Product and sub-Product tables which are applicable, for each particular ProductType) together.

  8. Gigantic ? Commercial databases (not the freeware) have no problems with large tables or joins. This is actually a very efficient structure, and allows very fast searches, because the tables are in fact column-oriented (not row-oriented). If the population is gigantic, then it is gigantic, do your own arithmetic.

  9. You need one more table, a Lookup table for Property (or Attribute). That is part of the catalogue, and based on ProductType

The better solution is to go for full, formal Sixth Normal Form. Not necessary if you have only one or a few tables that require optional columns.

To be clear:

  • Sixth Normal Form is The Row consists of the Primary Key and, at most, one Attribute.

  • This is 6NF (for at least the Product table cluster), then Normalised again (Not in the Normal Form sense) by DataType, to reduce the no of tables (otherwise you would have one table per Attribute).

  • This retains full Rdb control (FKs, constraints, etc); whereas the common EAV types don't bother with the DRI and control.

  • This also has the rudiments of a catalogue.

Link to Product Cluster Data Model

Link to IDEF1X Notation for those who are unfamiliar with the Relational Modelling Standard.

Update

You might be interested in this ▶5NF 6NF Discussion◀. I will write it up at some point.

著墨染雨君画夕 2024-10-11 08:17:34

最初,我建议您使用一个 productproperty 表来对产品和属性之间的关系进行建模。这将允许您将许多产品与特定属性相关联。

然而,我并不热衷于将值以 1:1 的方式与每个属性一起存储。如果您有一个将属性与值关联起来的 propertyvalue 表,这可能是最好的。然后,您将放弃 productproperty 表,转而使用更丰富的 productpropertyvalue 表,该表可以完整描述产品、其属性及其值之间的关系。

也许您可以得到以下结果:

product => (ID (unique key), Name, Description)
property => (ID (unique key), Description)
propertyvalue => (ID (unique key), propertyID (foreign key), value)
productpropertyvalue => (ID (unique key), productID (foreign key), propertyValueID (foreign key))

当然,属性值可以是复杂的而不是简单的字符串或整数,但希望这能带您走向正确的方向。

Initially I'd have suggested you have a productproperty table to model the relationship between products and properties. This would allow you to associate many products with a particular property.

However, I'm not keen on the idea of having a value stored alongside each property as a 1:1. It might be best if you have a propertyvalue table that associates a property with a value. Then, you'd ditch the productproperty table in favour of having a richer productpropertyvalue table which could fully describe the relationship between a product, it's properties and their values.

Perhaps then you could have the following:

product => (ID (unique key), Name, Description)
property => (ID (unique key), Description)
propertyvalue => (ID (unique key), propertyID (foreign key), value)
productpropertyvalue => (ID (unique key), productID (foreign key), propertyValueID (foreign key))

Of course, property values could be complex rather than simple strings or integers, but hopefully this takes you in the right direction.

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文