数据库建模 - 概念上不同的实体具有几乎相同的字段

发布于 2024-08-25 07:55:35 字数 1126 浏览 6 评论 0原文

假设您有两组概念实体:

  • MarketPriceDataSet 具有多个 ForwardPriceEntries
  • PoolPriceForecastDataSet 具有多个 PoolPriceForecastEntry

两个不同的子实体对象具有几乎相同的字段:

ForwardPriceEntry 具有

  • StartDate
  • EndDate
  • SimulationItemId
  • ForwardPrice
  • MarketPriceDataSetId (父表的外键)

PoolPriceForecastEntry 具有

  • StartDate
  • EndDate
  • MoldingItemId
  • ForecastPoolPrice
  • PoolPriceForecastDataSetId (父表的外键)

如果我将它们建模为单独的表,唯一的区别是外键,以及价格字段的名称。

关于是否应该将两个几乎相同的表合并为一个存在争议。

我想到的建模选项是:

  • 只需将它们保留为两个独立的表,
  • 将两个集合放在一个表中,并带有一个附加的“类型”字段,以及一个等于外键的 parent_id任一父表。这会牺牲引用完整性检查。
  • 将这两个集合放在一个带有附加“类型”字段的表中,并创建复杂的连接表序列以保持引用完整性。

你认为我应该做什么,为什么?


其他可能相关或不相关的信息:

  • 这两组数据松散相关:每组数据将加载到内存中进行批处理,偶尔还会加载一组PoolPriceForecastEntries 将通过从 ForwardPriceEntries 复制来生成。

  • MarketPriceDataSetPoolPriceForecastDataSet 确实有不同的字段。可以将它们合并到一个表中,但是一半条目中的字段将毫无意义。

Suppose you have two sets of conceptual entities:

  • MarketPriceDataSet which has multiple ForwardPriceEntries
  • PoolPriceForecastDataSet which has multiple PoolPriceForecastEntry

Both different child objects have near identical fields:

ForwardPriceEntry has

  • StartDate
  • EndDate
  • SimulationItemId
  • ForwardPrice
  • MarketPriceDataSetId (foreign key to parent table)

PoolPriceForecastEntry has

  • StartDate
  • EndDate
  • SimulationItemId
  • ForecastPoolPrice
  • PoolPriceForecastDataSetId (foreign key to parent table)

If I modelled them as separate tables, the only difference would be the foreign key, and the name of the price field.

There has been a debate as to whether the two near identical tables should be merged into one.

Options I've thought of to model this is:

  • Just keep them as two independent, separate tables
  • Have both sets in the one table with an additional "type" field, and a parent_id equalling a foreign key to either parent table. This would sacrifice referential integrity checks.
  • Have both sets in the one table with an additional "type" field, and create a complicated sequence of joining tables to maintain referential integrity.

What do you think I should do, and why?


Other information which may or may not be relevant:

  • The two sets of data are loosely related: one set of each will be loaded into memory for batch processing, and occasionaly a set of PoolPriceForecastEntries will be generated by copying from ForwardPriceEntries.

  • MarketPriceDataSet and PoolPriceForecastDataSet do have differing fields. It would be possible to merge them into one table, but then you would have fields that are meaningless in half the entries.

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(4

め可乐爱微笑 2024-09-01 07:55:35

“关于是否应该将两个几乎相同的表合并为一个存在争议。”

有吗?

在数据库设计中,如果表的谓词不同,则表应该保持独立。也就是说,表中各行的含义是否不同。列组是否几乎或什至完全相等并不重要。

"There has been a debate as to whether the two near identical tables should be merged into one."

Has there ?

In database design, tables should remain separate if their predicates are different. That is, if the meaning attached to the rows in the table is different. Whether the set of columns is almost or even completely equal, is immaterial.

幼儿园老大 2024-09-01 07:55:35

通常,在您描述的情况下,建模的主题符合一种称为“泛化专业化”的模式。先前的两个答案讨论子类型的事实支持了这一点。对象模型倾向于通过使用子类型和继承来处理生成规范。

关系模型也可以处理 gen-spec,但它有点复杂,并且通常不会在数据库入门中教授。您要做的就是拥有多个表,每个表对应一种子类型,其中包含该子类型特有的键和非键属性。与超类型相关并由所有子类型继承的数据通常合并到单个表中。

您可以使用这些表的键来玩一个技巧,从而减少存储和处理。在子类型表中,使用回到超类型条目的外键作为子类型表的主键。它保证存在,并且保证是唯一的。为什么还要费心去发明另一把钥匙呢?

如果您搜索“泛化专业化关系建模”,您将获得大约十几篇有关该主题的文章。其中一些非常好。您还可以关注标签:

Very often, in the situation you described, the subject being modeled fits a pattern called "generalization specialization". The fact that two prior answers discuss subtypes supports this. An object model tends to deal with gen-spec by using subtypes and inheritance.

The relational model can also deal with gen-spec, but it's a little more intricate and is not usally taught in database primers. What you do is have several tables, one for each subtype, holding a key and non key attributes that are peculiar to the subtype. Data that is relevant to the supertype, and inherited by all the subtypes, is generally consolidated into a single table.

There is a trick you can play with the keys to these tables that cuts down on both storage and processing. In the subtype tables, use the foreign key back to the supertype entry as the primary key to the subtype table. It's guaranteed to exist, and it's guaranteed to be unique. Why bother to invent yet another key?

If you search on "generalization specialization relational modeling" you'll get about a dozen articles on the subject. some of them are quite good. You can also follow the tag:

把时间冻结 2024-09-01 07:55:35

由于实体共享如此多的属性,您可以将它们视为子类型。

从逻辑上讲,您将拥有 PriceEntry 的超类型以及 ForwardPriceEntry 和 PoolPriceForecastEntry 的子类型。需要回答的一个问题是价格对于超类型是否是通用的。我假设是这样。

现在的问题是如何物理地实现子类型。您可以采用 3 种方法:

  1. 为每个子类型创建一个表(Rolldown)
  2. 创建包含所有属性的单个表(Rollup)
  3. 为每个子类型创建一个父类型表和一个子类型表(Identity)

这些方法中的每一种都有优点和缺点。有关每种方法的优缺点的讨论,请参阅 与子类型进行身体接触

在这种情况下,由于子类型共享如此多的属性,因此您可以采用汇总方法并创建单个表。

PriceEntry 表可能如下所示:

PriceEntryId (PK)
PriceEntryTypeCode (NN)
StartDate (NN)
EndDate 
SimulationItemId (NN)
Price (NN)
MarketPriceDataSetId (FK)
PoolPriceForecastDataSetId (FK)

您仍然可以在 MarketPrice 和 PoolForecast 列上强制执行 FK。您还可以添加表级检查约束以确保至少填充一个 FK。

然而,由于这两种亚型之间只有一种属性不同,因此许多优点和缺点并不强烈指向一个方向。因此,最终,我可能会选择保持数据模型易于理解和使用。对我来说,两个表(rolldown)方法取得了很好的平衡;从概念上讲,它比汇总更容易理解,并且在开发时它比身份更容易使用(无连接或多次插入/更新)。

Since the entities share so many attributes you could think of them as subtypes.

Logically you would have a supertype of PriceEntry and subtypes of ForwardPriceEntry and PoolPriceForecastEntry. One question to answer would be whether the price is common to the supertype. I'll assume it is.

Now the question is how to physically implement the subtype. There are 3 approaches that you can take:

  1. Create a table for each subtype (Rolldown)
  2. Create a single table with all attributes (Rollup)
  3. Create one supertype table and a subtype table for each subtype (Identity)

Each of these approaches has pros and cons. For a discussion of the pros and cons of each approach see Getting Physical with Subtypes.

In this case, since the subtypes share so many attributes, you could take the rollup approach and create a single table.

The PriceEntry table could look like this:

PriceEntryId (PK)
PriceEntryTypeCode (NN)
StartDate (NN)
EndDate 
SimulationItemId (NN)
Price (NN)
MarketPriceDataSetId (FK)
PoolPriceForecastDataSetId (FK)

You can still enforce the FK on the MarketPrice and PoolForecast columns. You could also add a table level check constraint to ensure that at least one of the FKs are populated.

However, since there is only one attribute that is different between the 2 subtypes, many of the pros and cons do not strongly point in one direction. So, in the end, I would probably opt for keeping the data model easy to understand and use. For me, the two table (rolldown) approach strikes a nice balance; conceptually it's easier to understand than the rollup and when developing it's easier to use than identity (no joins or multiple inserts/updates).

带刺的爱情 2024-09-01 07:55:35

如果您打算以大致相同的方式使用两个子类型中的数据,我建议将所有数据放入包含所有五个字段的单个表中...

  • StartDate
  • EndDate
  • SimulationItemId
  • ForwardPrice
  • MarketPriceDataSetId(父级的外键)表)

...并将 null 放入 ForwardPrice 或 ForecastPoolPrice 字段中,具体取决于特定行支持的父实体。每行上的空值的空间成本几乎不存在,表上的压缩成功率将非常高,但对于索引和读取,您的性能将令人印象深刻。

If it's your intention to use the data from the two sub-types in roughly the same way I'd suggest putting all the data into a single table housing all five of the fields...

  • StartDate
  • EndDate
  • SimulationItemId
  • ForwardPrice
  • MarketPriceDataSetId (foreign key to parent table)

... and place a null into either the ForwardPrice or ForecastPoolPricedepending field pending the parent entity a particular row is supporting. The space cost of the null on each row would be virtually non-existent and compression success would be very high on the table but for indexing and reads your performance would be impressive.

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文