数据建模:超类型/子类型

发布于 2024-10-13 11:34:13 字数 382 浏览 7 评论 0原文

寻求找出对以下需求进行建模的正确方法。

  1. 有 3 种类型的“当事人”需要关注:粉丝、乐队和乐队成员。
  2. 该乐队成员将始终与乐队相关联,并且也可以是任何乐队的粉丝。
  3. 粉丝、乐队和乐队成员之间有共同的属性,但这三者也都有自己独特的属性。
  4. 粉丝可以是任何乐队的粉丝,也可以不是任何乐队的粉丝。

这只是一个更大想法的一小部分,但它在扩展模型时造成了混乱。我相信它必须是图 2 或其他一些选项,因为我不明白 BandMember 如何与第一个模型中的 Band 关联。

我很感激任何意见。

替代文本

Looking to figure out the proper way to model the below requirements.

  1. There are 3 types of “parties” to be concerned with, a Fan, a Band, and a BandMember.
  2. That BandMember will always be associated with a Band and can also be a Fan of any band.
  3. There are common attributes between a Fan, a Band, and a BandMember, but each of these 3 will also have their own unique attributes.
  4. A Fan can be a fan of of any Band or none at all

This is a small part of a bigger thought but it is creating confusion in expanding the model. I believe it would have to be diagram 2 or some other option since I don't see how a BandMember can be associated with a Band in the first model.

I appreciate any input.

alt text

alt text

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(2

怀里藏娇 2024-10-20 11:34:13

警告

  1. 首先,有一些警告可以帮助您了解其局限性。所有正在使用或存储的数据都需要一起考虑/建模。例如。无论如何,您已经在“扩展模型时造成混乱”中发现了这一点。从我的角度来看,不知道Parties(子类型)如何与其他实体相关,这限制了我提供完全正确的答案(不会改变)。

    由于您分两部分提供数据,因此答案将分为两部分,第二部分将需要对第一个模型进行更改。不是抱怨,只是提前建议您,因为如果我预先看到所有数据,就可以避免这种情况。

  2. 您认识到需要 (a) 对数据进行建模以及 (b) 经历概念、逻辑和物理科学(已有 30 多年的记录),这真是太棒了。这正是我所做的。您无法想象正式流程节省的时间和精力。

    • 但是,这并没有出现在我在 SO 的回答中,因为这是一个“问答”网站,我必须在提问者的层面上回答。 (我比其他人更全面地回答问题,即使这样,也会引起负面评论!)。请放心,我会按照正式的顺序进行。
  3. 必须提到的是,关系建模是 20 世纪 80 年代业界巨头 EF Codd 博士和 R Brown 的作品。该方法论基于他们的工作。 IDEF1X 于 1993 年成为 NIST 标准。当我回答数据建模问题时,我并没有提供我写过一本书的个人方法,我站在巨人的肩膀上。

    • 我遵循 EF Codd 博士的关系模型

    • 我拒绝Date的作品;达尔文;费金;等等,因为它与关系模型相矛盾,所以它是反关系模型。

    • 相比之下,ERD(P Chen)是前关系型、前 IDEF1X 型和原始型。它没有标识符的概念,它无法处理关系键。令人震惊的是,ERD 仍然被教导为“关系型”,而 IDEF1X 却受到压制。 30 年。

数据建模

  1. 是的,这里的超类型-子类型结构是正确的。不幸的是,它并不常见,因此也未被普遍理解。
  • 子类型早在关系模型出现之前就已经存在,并且仍然存在。 IDEF1X 是唯一的关系数据建模标准,有特定的符号。 ERD什么也没有。
  1. 另外,即使在实现子类型的情况下,它们也是以非常有限的方式实现的。子类型具有改变父类型的角色的作用。正确的实现确实非常罕见,我在任何地方都没有看到过这种情况(当然除了我自己为客户和一些高端供应商实现的数据库实现)。

  2. 重点是,这可能看起来“复杂”,但事实并非如此,它实际上非常简单。

    这就是 Ken Downs 和 Chris Behrens 将建模的简单性(高度可扩展)与未建模的实现(不正确且不可扩展)混淆的地方,因为 Martin Fowler 等矮人建议采用简单化的方法。无意冒犯,我知道人们会依恋并捍卫他们所知道的东西,无论这些知识有多么有限。

    • 请注意,每个子类型本身也是一个完全有效的实体(当我们到达该阶段时,物理中的表),并且可以独立存在。

    • 对于与这些子类型有关系的较低级别或事务或功能表,技巧是使用正确的子类型(角色)。常见的错误是他们使用了Party,然后使用了子类型或角色的含义,并且丢失了正确的引用完整性。

    • 所有 RoleName 均派生Party,但这并不是使用Party 代替正确名称的正当理由角色。

    • 在这里,您非常了解数据,但是(没有人教过您这一点并且)您对角色和子类型感到困惑。

    • BandMemberFan 不是派对。首先,他们是 Person(而 Person 是一个 Party,第二)

  3. 为了清楚地说明这些要点,在这个概念上在级别上,我们需要使用实体和标识符(而不是属性),而不仅仅是实体。因此我也提供了这一点。

    • 看来您拥有 ERwin(最好的!);它允许您非常方便地查看该级别的单个模型。即使在这个抽象级别,也要在实体中实现标识符。

障碍

我在介绍模型之前指出了这些,因为您似乎对学习 IDEF1X(关系数据库建模的标准方法)非常感兴趣,以期将来简化您的模型。 SO 或任何网站都不是正式互动教育的良好媒介,但我们会尽力而为。

  1. 在模型 (1) 中,Band 不能是独立的(方角):因为它被识别为依赖于 Party;它是Party的子类型;并且它具有相同的标识符。

  2. 缺少的基数至关重要。将其放入实际上将有助于解决模型。我对 IDEF1X(圆圈)和 IEEE(鱼尾纹)并不大惊小怪,但我总是边走边把它们放进去,并随着模型的进展不断改变它们。

    • 您的模型没有显示Band是由一对多成员组成的。等等。
      .
      虽然编程可以逐步进步(一旦定义稳定),但建模却不能。例如。你不能对实体建模,但不能对关系建模;关系,但不是基数。这就是为什么它们是不同的科学,程序员不能成为优秀的建模者,反之亦然。
      .
  3. 在这个阶段,规则也非常重要。建模实际上就是对规则进行建模。因此,纠正或调整规则是建模过程的一部分。

    • 粉丝可以是任何乐队的粉丝,也可以不是任何乐队的粉丝是不合理的。如果一个根本就没有,那么他们就是公众的一员,并且他们与任何乐队没有关系。一个普通的

    • 粉丝与至少一个乐队有关系。事实上,与Band 的关系可以将Person 移出该领域,并导致Fan 详细信息或特定粉丝的存储-乐队详细信息。

    • 如果存在诸如Fan with no Band这样的实体(即您正在存储其详细信息,根据我的模型与Fan分开) ,请指教,我会改变模型(纸张很便宜!)。

  4. 动词短语在此阶段也很重要;正如我上面关于规则和基数的观点一样,它是建模过程的一部分,并且随着模型的进展需要更改/调整。您不会相信正确使用动词短语有多么重要。将它们放入很可能有助于您澄清子类型与角色。这是每个数据建模者都熟知的定义。

    • 实体是模型中的名词

    • 关系是动词,即名词之间发生的动作

    • 动词短语定义了这些动作(这就是为什么它们被准确地称为动词短语,这不是一个有趣的名字)。

IDEF1X 符号<中所述/strong> 文档,对于关联表,将动词短语“穿过”它们,读到关联另一侧的父级。

  • 创建一对多乐队,因此是成员

  • 创建了一个乐队由一对多的人员组成,他们是成员

  • 一个光顾一个 Band,这使他们成为一个 Fan(而不仅仅是在 Fan 表中拥有一行的 Person

  • Band 取决于 People,即 Fans

想出最短、最有意义的动词短语;不使用简单的单词(要避免“包含”),对建模者来说是一个挑战。请随意改进我提供的动词短语。

这是您的IDEF1X 中实体和键级别的派对数据模型

不熟悉的读者关系数据库建模标准可能会找到我的 IDEF1X 符号 有用。

注意

我所做的只是解析了角色的基数,如上面所确定的

  1. 当您这样做时,您会更加清楚子类型与角色的相关性 。评估您的第二部分或您的交易实体(正如您所说,这里我们只有识别实体)。

  2. 标识符。这一点值得详细说明,不仅是为了阐明模型,还因为它是使用 IDEF1X 标识符及其部署功能的一个很好的示例。您已经在模型中表明您了解(实线),我只是对其进行全面处理。

    • PersonBandParty 的子类型。它们也是Party的角色。因此,我们从该点向下使用 PersonIdBandId,而不是 PartyId(即使它是 PartyId)。< /p>

    • Person扮演Member角色时,我们使用MemberId(即PersonId ,即 PartyId)。

    • Person扮演粉丝的角色时,我们使用FanId(即PersonId ,即 PartyId)。

假设您要列出 BandFans,您的查询以 Fan 为中心。如果您在每张桌子上都有这些Id代理键,您将被迫加入Person,然后加入Party。但使用您拥有的关系标识符,您可以直接转到 Party:

SELECT  ...,
        Name  -- Party.Name
        ...
    FROM Party
    JOIN Fan
        ON PartyId = FanId

并跳过中间的 Band 表。是的,事实是,规范化关系数据库需要更少的连接、更少的资源(处理、缓存、磁盘 I/O),这就是它们性能如此出色的原因之一。这个神话没有科学依据。

请评估并提出具体问题。

更新

有关子类型的一般处理以及真正 SQL 平台的实现细节,请参阅我的 子类型文档

Caveat

  1. First a couple of caveats for understanding the limitations. All the data being used or stored needs to be considered/modelled together. Eg. you have found that out anyway in your "creating confusion in expanding the model". From my side, not knowing how the Parties (subtypes) are related to other entities, limits me from provided a totally correct answer (that will not change).

    Since you are providing the data in two tranches, therefore the answer will be in two tranches, and the second tranche will require changes to the first model. Not a complaint, just advising you beforehand, as that could be avoided if I saw all the data up front.

  2. It is really great that you appreciate the need for (a) modelling the data and (b) going through the science (documented for over 30 years) of Conceptual, Logical, then physical. That is exactly what I do. You cannot imagine the time and effort that is saved by the formal process.

    • However, that does not come across in my answers at SO, because it is a "question and answer" site, I have to answer at the level of the questioner. (I answer questions more fully than others, and even that, causes negative commentary !). Be assured that I go through the formal sequence.
  3. It must be mentioned that Relation Modelling is the work of giants in the industry, Dr E F Codd and R Brown in the 1980's. The Methodology is based on their work. IDEF1X became a NIST Standard in 1993. When I answer Data Modelling questions, I am not supplying some personal method that I wrote a book about, I stand on the shoulders of giants.

    • I adhere to the Relational Model by Dr E F Codd.

    • I reject the work of Date; Darwen; Fagin; etc, because it contradicts the Relational Model, it is anti-Relational.

    • ERD (P Chen) is pre-relational, pre-IDEF1X, and primitive in comparison. It has not concept of the Identifier, it cannot handle a Relational Key. It is staggering that ERD is still being taught as "relational", and IDEF1X is suppressed. For 30 years.

Data Modelling

  1. Yes, a Supertype-Subtype structure is correct here. Unfortunately it is not common and therefore not commonly understood.
  • Subtypes existed long before the Relational Model, and continues to exist. IDEF1X, the only Standard for Relational Data Modelling, has specific symbols for it. ERD has nothing.
  1. Separately, even where Subtypes are implemented, they are implemented in very limited fashion. Subtypes have the effect of changing the Role of the Supertype. The correct implementation of that is very rare indeed, I have not seen this anywhere (except of course my own database implementations for customers, and a few high-end suppliers).

  2. Point is, that may look "complex" but it isn't, it is actually very simple.

    This is where Ken Downs and Chris Behrens confuse modelled simplicity (highly extensible) with unmodelled implementation (incorrect and un-extensible), due to the simplistic approach advised by dwarves such as Martin Fowler. No offence, I understand that people are attached to, and will defend, what they know, however limited as that may be.

    • notice that each Subtype is also a perfectly valid Entity (Table in the Physical, when we get to that stage) in its own right, and can stand on its own.

    • for the lower levels or transaction or function tables, which have Relations to these Subtypes, the trick is to use the correct Subtype (Role). The common mistake is they use Party, and then the meaning of the Subtype or Role, and the correct Referential Integrity is lost.

    • separately all the RoleNames are derived from Party, but that is not a valid reason to useParty instead of the correct Role.

    • Here you understand the data really well, but (no one has taught you this and) you have confused Roles and Subtypes.

    • BandMember and Fan are not Parties. They are Persons, first (and Person is a Party, second)

  3. In order to provide clarity re those points, at this Conceptual level, we need to work with Entities and Identifiers (not Attributes), rather than just Entities. Therefore I have provided that as well.

    • It appears you have ERwin (the best!); it allows you to view the single model at that level very conveniently. Do implement the Identifiers in the Entities, even at this abstract level.

Obstacle

I point these out before presenting the model, because you appear to be seriously interested in learning IDEF1X, the Standard methodolgy for modelling Relational Databases, with a view to easing your models in the future. SO or any website, is not a good medium for formal interactive education, but we will give it our best shot.

  1. In model (1), Band cannot be Independent (square corners): since it is identified as being Dependent on Party; it is a Subtype of Party; and it has the same Identifier.

  2. The missing Cardinality is critical. Putting it in will actually assist in resolving the model. I am not fussed re IDEF1X (circles) vs IEEE (crows feet), but I always put them in as I go, and keep changing them as the model progresses.

    • your model does not show that a Band is made up of one-to-many Members. Et cetera.
      .
      While programming can progress incrementally (once the definition is stable)), modelling does not. Eg. you can't model the Entities but not the Relations; the Relations but not the Cardinality. That is why they are different sciences, programmers do not make good modellers, and vice versa.
      .
  3. At this stage the Rules are also very important. Modelling is, in fact, modelling the Rules. Therefore correcting or modulating the Rules is part of the Modelling process.

    • A Fan can be a fan of of any Band or none at all is not reasonable. If a Person is none at all then they are a member of the general public, and they have no Relation to any Band. An ordinary Person.

    • A Fan has a Relation to at least one Band. In fact, having a Relation to a Band is what takes a Person out of that realm and causes storage of Fan details or specific fan-of-band details.

    • If there are is such an Entity as Fan with no Band (ie. you are storing details of that, separate to Fan as per my model), please advise, and I will change the model (paper is cheap!).

  4. Verb Phrases are also important at this stage; no less than my point re Rules and Cardinality above, it is a part of the modelling process, and it needs change/modulation as the model progresses. You will not believe how important it is to gett the Verb Phrases right. Putting them in may well have assisted you in clarifying Subtypes vs Roles. Here is a definition that every Data Modeller knows by heart.

    • The Entities are the Nouns in the model

    • The Relations are the Verbs, the actions that take place between the Nouns

    • The Verb Phrases define those actions (that's why they are accurately called Verb Phrases, it is not a funny name).

As described in the IDEF1X Notation document, for Associative tables, read the Verb Phrase "through" them, to the parent on the other side of the association.

  • A Person makes one-to-many Bands, and is thus a Member

  • A Band is made up of one-to-many People, who are Members

  • A Person patronises a Band, which makes them a Fan (not merely a Person who has a row in the Fan table)

  • A Band depends on People, who are Fans

Coming up with the shortest, most meaningful, Verb Phrase; not using simplistic words ("comprises" is to be avoided"), is a challenge for Modellers. Feel free to ameliorate the Verb phrases I supplied.

Here is your Party Data Model at the Entity and Key levels in IDEF1X.

Readers who are unfamiliar with the Standard for Modelling Relational Databases may find my IDEF1X Notation useful.

Note

All I have done is resolve the Subtypes; Roles; the Cardinality of the Relations, as identified above.

  1. The relevance of the Subtypes vs Roles will be more clear to you when you evaluate your second tranche or your transaction Entities (as you have stated, here we have only Identifying Entities).

  2. Identifiers. This is worth spelling out, not only for the purpose of clarifying the model, also because it is a good example of IDEF1X Identifiers being used, and the power they deploy. You have already indicated in your model that you understand that (solid lines), I am merely giving it the full treatment.

    • Person and Band are Subtypes of Party. They are also Roles of Party. Therefore we use PersonId and BandId from that point downward, not PartyId (even though it is PartyId).

    • When a Person plays the Role of Member, we use MemberId (which is PersonId, which is PartyId).

    • When a Person plays the Role of Fan, we use FanId (which is PersonId, which is PartyId).

Let's say you were listing the Fans of a Band, your query is centred on Fan. If you had those Id Surrogate Keys on every table, you would be forced to join Person, then join Party. But with the Relational Identifiers you have, you can go directly to Party:

SELECT  ...,
        Name  -- Party.Name
        ...
    FROM Party
    JOIN Fan
        ON PartyId = FanId

and skip the Band table in-between. Yes, the truth is, Normalised Relational Database require fewer joins, less resources (processing, cache, disk I/O), and that is one reason why they perform so much better. The myth has no scientific basis.

Please evaluate and ask specific questions.

Update

For a generic treatment of Subtypes, and implementation details for genuine SQL platforms, refer to my Subtype Document.

我的鱼塘能养鲲 2024-10-20 11:34:13

我认为这比你想象的要简单。您有两个对象 - Band 和 Person,它们可以通过两种不同的方式连接,无论是作为粉丝还是作为成员。这是一个没有外键或任何内容的快速数据库脚本:

CREATE TABLE [dbo].[XREFBandMembers](
    [MemberID] [int] NOT NULL,
    [BandId] [int] NOT NULL,
 CONSTRAINT [PK_XREFBandMembers] PRIMARY KEY CLUSTERED 
(
    [MemberID] ASC,
    [BandId] ASC
)WITH (PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON) ON [PRIMARY]
) ON [PRIMARY]
GO

CREATE TABLE [dbo].[XREFBandFans](
    [FanId] [int] NOT NULL,
    [BandId] [int] NOT NULL,
 CONSTRAINT [PK_XREFBandFans] PRIMARY KEY CLUSTERED 
(
    [FanId] ASC,
    [BandId] ASC
)WITH (PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON) ON [PRIMARY]
) ON [PRIMARY]

CREATE TABLE [dbo].[People](
    [Id] [int] IDENTITY(1,1) NOT NULL,
    [Name] [nvarchar](100) NOT NULL,
 CONSTRAINT [PK_People] PRIMARY KEY CLUSTERED 
(
    [Id] ASC
)WITH (PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON) ON [PRIMARY]
) ON [PRIMARY]

CREATE TABLE [dbo].[Bands](
    [Id] [int] IDENTITY(1,1) NOT NULL,
    [Name] [nvarchar](50) NOT NULL,
 CONSTRAINT [PK_Bands] PRIMARY KEY CLUSTERED 
(
    [Id] ASC
)WITH (PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON) ON [PRIMARY]
) ON [PRIMARY]
GO

至于关系特定属性,您可以将它们放在 XREF 表中,例如,FanClubMembershipNumber 放在 XREFBandFans 中。

I think this is simpler than you think. You've got two objects - Band and Person, and they can be connected in two different ways, either as a fan or a member. Here is a quickie db script with no foreign keys or anything:

CREATE TABLE [dbo].[XREFBandMembers](
    [MemberID] [int] NOT NULL,
    [BandId] [int] NOT NULL,
 CONSTRAINT [PK_XREFBandMembers] PRIMARY KEY CLUSTERED 
(
    [MemberID] ASC,
    [BandId] ASC
)WITH (PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON) ON [PRIMARY]
) ON [PRIMARY]
GO

CREATE TABLE [dbo].[XREFBandFans](
    [FanId] [int] NOT NULL,
    [BandId] [int] NOT NULL,
 CONSTRAINT [PK_XREFBandFans] PRIMARY KEY CLUSTERED 
(
    [FanId] ASC,
    [BandId] ASC
)WITH (PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON) ON [PRIMARY]
) ON [PRIMARY]

CREATE TABLE [dbo].[People](
    [Id] [int] IDENTITY(1,1) NOT NULL,
    [Name] [nvarchar](100) NOT NULL,
 CONSTRAINT [PK_People] PRIMARY KEY CLUSTERED 
(
    [Id] ASC
)WITH (PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON) ON [PRIMARY]
) ON [PRIMARY]

CREATE TABLE [dbo].[Bands](
    [Id] [int] IDENTITY(1,1) NOT NULL,
    [Name] [nvarchar](50) NOT NULL,
 CONSTRAINT [PK_Bands] PRIMARY KEY CLUSTERED 
(
    [Id] ASC
)WITH (PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON) ON [PRIMARY]
) ON [PRIMARY]
GO

As for the relationship specific attributes, you can place them in the XREF tables, e.g., FanClubMembershipNumber goes in XREFBandFans.

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文