当前位置：文江博客话题详情

数据库建模 - 概念上不同的实体具有几乎相同的字段

发布于 2024-08-25 07:55:35 字数 1126 浏览 6 评论 0原文

假设您有两组概念实体：

MarketPriceDataSet 具有多个 ForwardPriceEntries
PoolPriceForecastDataSet 具有多个 PoolPriceForecastEntry

两个不同的子实体对象具有几乎相同的字段：

ForwardPriceEntry 具有

StartDate
EndDate
SimulationItemId
ForwardPrice
MarketPriceDataSetId （父表的外键）

PoolPriceForecastEntry 具有

StartDate
EndDate
MoldingItemId
ForecastPoolPrice
PoolPriceForecastDataSetId （父表的外键）

如果我将它们建模为单独的表，唯一的区别是外键，以及价格字段的名称。

关于是否应该将两个几乎相同的表合并为一个存在争议。

我想到的建模选项是：

只需将它们保留为两个独立的表，
将两个集合放在一个表中，并带有一个附加的“类型”字段，以及一个等于外键的 parent_id任一父表。这会牺牲引用完整性检查。
将这两个集合放在一个带有附加“类型”字段的表中，并创建复杂的连接表序列以保持引用完整性。

你认为我应该做什么，为什么？

其他可能相关或不相关的信息：

这两组数据松散相关：每组数据将加载到内存中进行批处理，偶尔还会加载一组PoolPriceForecastEntries 将通过从 ForwardPriceEntries 复制来生成。
MarketPriceDataSet 和 PoolPriceForecastDataSet 确实有不同的字段。可以将它们合并到一个表中，但是一半条目中的字段将毫无意义。

原文

分享到QQ

分享到微博

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

需要登录才能够评论，你可以免费注册一个本站的账号。

め可乐爱微笑 2024-09-01 07:55:35

“关于是否应该将两个几乎相同的表合并为一个存在争议。”

有吗？

在数据库设计中，如果表的谓词不同，则表应该保持独立。也就是说，表中各行的含义是否不同。列组是否几乎或什至完全相等并不重要。

回复收藏 0 原文

幼儿园老大 2024-09-01 07:55:35

通常，在您描述的情况下，建模的主题符合一种称为“泛化专业化”的模式。先前的两个答案讨论子类型的事实支持了这一点。对象模型倾向于通过使用子类型和继承来处理生成规范。

关系模型也可以处理 gen-spec，但它有点复杂，并且通常不会在数据库入门中教授。您要做的就是拥有多个表，每个表对应一种子类型，其中包含该子类型特有的键和非键属性。与超类型相关并由所有子类型继承的数据通常合并到单个表中。

您可以使用这些表的键来玩一个技巧，从而减少存储和处理。在子类型表中，使用回到超类型条目的外键作为子类型表的主键。它保证存在，并且保证是唯一的。为什么还要费心去发明另一把钥匙呢？

如果您搜索“泛化专业化关系建模”，您将获得大约十几篇有关该主题的文章。其中一些非常好。您还可以关注标签：类表继承

回复收藏 0 原文

把时间冻结 2024-09-01 07:55:35

由于实体共享如此多的属性，您可以将它们视为子类型。

从逻辑上讲，您将拥有 PriceEntry 的超类型以及 ForwardPriceEntry 和 PoolPriceForecastEntry 的子类型。需要回答的一个问题是价格对于超类型是否是通用的。我假设是这样。

现在的问题是如何物理地实现子类型。您可以采用 3 种方法：

为每个子类型创建一个表（Rolldown）
创建包含所有属性的单个表（Rollup）
为每个子类型创建一个父类型表和一个子类型表（Identity）

这些方法中的每一种都有优点和缺点。有关每种方法的优缺点的讨论，请参阅与子类型进行身体接触。

在这种情况下，由于子类型共享如此多的属性，因此您可以采用汇总方法并创建单个表。

PriceEntry 表可能如下所示：

PriceEntryId (PK)
PriceEntryTypeCode (NN)
StartDate (NN)
EndDate 
SimulationItemId (NN)
Price (NN)
MarketPriceDataSetId (FK)
PoolPriceForecastDataSetId (FK)

您仍然可以在 MarketPrice 和 PoolForecast 列上强制执行 FK。您还可以添加表级检查约束以确保至少填充一个 FK。

然而，由于这两种亚型之间只有一种属性不同，因此许多优点和缺点并不强烈指向一个方向。因此，最终，我可能会选择保持数据模型易于理解和使用。对我来说，两个表（rolldown）方法取得了很好的平衡；从概念上讲，它比汇总更容易理解，并且在开发时它比身份更容易使用（无连接或多次插入/更新）。

Since the entities share so many attributes you could think of them as subtypes.

Logically you would have a supertype of PriceEntry and subtypes of ForwardPriceEntry and PoolPriceForecastEntry. One question to answer would be whether the price is common to the supertype. I'll assume it is.

Now the question is how to physically implement the subtype. There are 3 approaches that you can take:

Create a table for each subtype (Rolldown)
Create a single table with all attributes (Rollup)
Create one supertype table and a subtype table for each subtype (Identity)

Each of these approaches has pros and cons. For a discussion of the pros and cons of each approach see Getting Physical with Subtypes.

In this case, since the subtypes share so many attributes, you could take the rollup approach and create a single table.

The PriceEntry table could look like this:

PriceEntryId (PK)
PriceEntryTypeCode (NN)
StartDate (NN)
EndDate 
SimulationItemId (NN)
Price (NN)
MarketPriceDataSetId (FK)
PoolPriceForecastDataSetId (FK)

You can still enforce the FK on the MarketPrice and PoolForecast columns. You could also add a table level check constraint to ensure that at least one of the FKs are populated.

However, since there is only one attribute that is different between the 2 subtypes, many of the pros and cons do not strongly point in one direction. So, in the end, I would probably opt for keeping the data model easy to understand and use. For me, the two table (rolldown) approach strikes a nice balance; conceptually it's easier to understand than the rollup and when developing it's easier to use than identity (no joins or multiple inserts/updates).

回复收藏 0 原文