SQL - Can this relation have a primary key?

Posted 2024-10-13 08:44:02

The relationship between bands and members, 'MemberOf', includes a member id (Mid), band id (Bid), 'StartYear', 'EndYear' and 'Instrument' played. Mid and Bid are both foreign keys to Member and Band relations respectively but I am struggling to identify the primary key in this situation for the 'MemberOf' relation.

Note: start year can be null if a member was in a band from the start and end year can also be null if a member is still in the band.

In this situation, the start and end years cannot be used as part of the PK because they may contain null values, but the member id and band id alone are not enough to guarantee uniqueness, e.g. when the same member returns to the same band at a later date.

Does this mean no primary key exists? Thanks.

Comments (5)

一杯敬自由 2024-10-20 08:44:02

This is a common problem. To see what's going on, we need to disentangle the relational concept 'primary key' from the SQL construct PRIMARY KEY.

The relational concept 'primary key' is a unique identifier for the rows in a table. In the context of your table, the band and the member are clearly part of the primary key, and so too is the period during which the member played for the band. A single member could play for a specific band in, say, 2003-2005, go off on a solo career for 2006-2007, and rejoin his original band for 2008-2011. You'd need two entries for the member in the table, one for each of the two periods.

Ideally, you'd be able to say that 'startYear' and 'endYear' form a single 'memberFor' period column and that the primary key is (bid, mid, memberFor). For a single combination of bid+mid, the memberFor values would then represent disjoint ranges: that is, for two rows R1, R2, the value R1.memberFor does not overlap or touch R2.memberFor. Example overlap: R1.memberFor = (2003-2005), R2.memberFor = (2004-2006). Example touch: R1.memberFor = (2003-2005), R2.memberFor = (2005-2007).

So much for the theory.

In practice, SQL does not support time periods, where a time period in this context has a start and a duration. The INTERVAL types do not qualify because they represent duration without the start, and the DATE, TIME, TIMESTAMP types do not qualify because they lack the duration (ignoring issues of quantization which are not germane to the matter at hand). Clearly, you can also represent a period by (duration, end) and also by (start, end), and in the latter case, you can have open, closed, open-closed or closed-open ranges for the start and end.

Because SQL does not support the requisite type, much less the necessary checking options, you have to do the checking yourself, and it is complex to do it. The SQL primary key only looks for equality of values. Unconstrained, it means that if you create the primary key on (bid, mid, start) or (bid, mid, end) or (bid, mid, start, end), you can end up with unwanted data in the table which satisfies the SQL primary key but not the conceptual primary key.

Bid     Mid     Start   End
  1       1      2003   2005   - α
  1       1      2004   2006   - β
  1       1      2004   2007   - γ
  1       1      2008   2011   - δ

Consider the row tagged α as correct. If your SQL primary key is on (bid, mid, start), then the row tagged β should not be allowed because it overlaps with α; similarly for γ. However, SQL only prevents β and γ both appearing; the one entered first will be OK and the one entered second will be rejected. If γ were allowed, then δ should not be allowed because it touches γ. Similarly, if the SQL primary key were on (bid, mid, start, end), all four rows would be allowed into the table, but clearly they should not be. And if the SQL primary key were on (bid, mid, end), all the rows above would be allowed in.
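As a rough illustration of the (bid, mid, start) case, the sketch below loads the four rows into an in-memory SQLite database through Python's sqlite3 module. SQLite is only a stand-in for whatever DBMS is in use, and the table name member_of and its column names are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table; the SQL primary key is (bid, mid, start_year).
conn.execute("""
    CREATE TABLE member_of (
        bid        INTEGER NOT NULL,
        mid        INTEGER NOT NULL,
        start_year INTEGER NOT NULL,
        end_year   INTEGER,
        PRIMARY KEY (bid, mid, start_year)
    )
""")

# The four rows from the table above, inserted in this order.
rows = {
    "alpha": (1, 1, 2003, 2005),
    "beta":  (1, 1, 2004, 2006),
    "gamma": (1, 1, 2004, 2007),
    "delta": (1, 1, 2008, 2011),
}

accepted, rejected = [], []
for tag, row in rows.items():
    try:
        conn.execute("INSERT INTO member_of VALUES (?, ?, ?, ?)", row)
        accepted.append(tag)
    except sqlite3.IntegrityError:
        # Only an exact duplicate of (bid, mid, start_year) ends up here.
        rejected.append(tag)

print("accepted:", accepted)
print("rejected:", rejected)
```

With this insertion order, beta is stored even though it overlaps alpha, and gamma is the only row refused, purely because it repeats beta's start year.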

So, you would have to augment the criteria for the SQL primary key with a complex query that filters out unwanted rows. This would likely be a stored procedure used in an insert trigger for the table and in an update trigger.

…details left as exercise for reader…

In your table, you have a year-based granularity. You could probably, therefore, get away with a revised table that stored just (mid, bid, memberInYear), and have one row for each band member for each year when they were a member of the given band. This reduces to a large quantum time period where the SQL primary key on the three columns enforces the constraint. However, if you change the granularity of the table to record the start and end dates (down to the day), then clearly the revised design does not work well.
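A minimal sketch of that one-row-per-year design, again using SQLite through Python's sqlite3 module as a stand-in. The table name member_in_year and the helper add_period are hypothetical, introduced only for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# One row per member, band and year of membership.
conn.execute("""
    CREATE TABLE member_in_year (
        mid  INTEGER NOT NULL,
        bid  INTEGER NOT NULL,
        year INTEGER NOT NULL,
        PRIMARY KEY (mid, bid, year)
    )
""")

def add_period(mid, bid, start_year, end_year):
    """Store a membership period as one row per year, both ends inclusive."""
    rows = [(mid, bid, year) for year in range(start_year, end_year + 1)]
    with conn:  # commits on success, rolls back if any insert fails
        conn.executemany("INSERT INTO member_in_year VALUES (?, ?, ?)", rows)

add_period(1, 1, 2003, 2005)
add_period(1, 1, 2008, 2011)

try:
    add_period(1, 1, 2004, 2006)  # shares 2004 and 2005 with the first period
    overlap_accepted = True
except sqlite3.IntegrityError:
    overlap_accepted = False

year_count = conn.execute("SELECT COUNT(*) FROM member_in_year").fetchone()[0]
print(overlap_accepted, year_count)
```

Here the plain primary key is enough to refuse an overlapping period. The cost is that two periods in consecutive years become indistinguishable from one longer period, and the row count grows with the length of each membership.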

So, you should probably go with an SQL primary key on (bid, mid, start) backed up by a stored procedure invoked from insert and update triggers that enforces the non-overlap and non-touching criteria.
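One way this could look is sketched below with SQLite triggers, driven from Python's sqlite3 module. It is not the stored procedure the answer leaves as an exercise: it assumes both years are non-NULL, treats each period as a closed range of years, and takes "touching" to mean sharing a boundary year, as in the (2003-2005)/(2005-2007) example. All table, column and trigger names are invented for the sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE member_of (
        bid        INTEGER NOT NULL,
        mid        INTEGER NOT NULL,
        start_year INTEGER NOT NULL,
        end_year   INTEGER NOT NULL,
        PRIMARY KEY (bid, mid, start_year),
        CHECK (start_year <= end_year)
    );

    -- Reject a new row whose closed range [start_year, end_year] shares
    -- at least one year with an existing row for the same (bid, mid).
    CREATE TRIGGER member_of_insert_check
    BEFORE INSERT ON member_of
    BEGIN
        SELECT RAISE(ABORT, 'period overlaps or touches an existing period')
        WHERE EXISTS (
            SELECT 1 FROM member_of AS m
            WHERE m.bid = NEW.bid AND m.mid = NEW.mid
              AND NEW.start_year <= m.end_year
              AND m.start_year <= NEW.end_year
        );
    END;

    -- Same test on update, skipping the row that is being updated.
    CREATE TRIGGER member_of_update_check
    BEFORE UPDATE ON member_of
    BEGIN
        SELECT RAISE(ABORT, 'period overlaps or touches an existing period')
        WHERE EXISTS (
            SELECT 1 FROM member_of AS m
            WHERE m.rowid <> OLD.rowid
              AND m.bid = NEW.bid AND m.mid = NEW.mid
              AND NEW.start_year <= m.end_year
              AND m.start_year <= NEW.end_year
        );
    END;
""")

def try_sql(sql, params=()):
    """Return True if the statement succeeds, False if the database rejects it."""
    try:
        conn.execute(sql, params)
        return True
    except sqlite3.DatabaseError:
        return False

insert = "INSERT INTO member_of VALUES (?, ?, ?, ?)"
results = {
    "alpha": try_sql(insert, (1, 1, 2003, 2005)),
    "beta":  try_sql(insert, (1, 1, 2004, 2006)),   # overlaps alpha
    "gamma": try_sql(insert, (1, 1, 2004, 2007)),   # overlaps alpha
    "touch": try_sql(insert, (1, 1, 2005, 2007)),   # shares 2005 with alpha
    "delta": try_sql(insert, (1, 1, 2008, 2011)),
}
# Moving delta's start back to 2005 would make it collide with alpha.
moved = try_sql("UPDATE member_of SET start_year = 2005 WHERE start_year = 2008")
print(results, moved)
```

If periods in consecutive years (for example 2004-2007 followed by 2008-2011) should also count as touching, the two comparisons would need to be widened by one year; that choice depends on how the ranges are meant to be read.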

寄人书 2024-10-20 08:44:02

"Note: start year can be null if a member was in a band from the start"

Well, that ridiculous requirement is the biggest part of your problem. Store the year.

Use {BandId, MemberId, StartYear} as the primary key.

That won't accommodate members who start in February, quit in March, and start again in July, though.

梦亿 2024-10-20 08:44:02

You are right that the dates cannot be considered part of the PK as they are data that will change over time. I think you have two options:

1) Split the MemberOf record into a parent-child pair. The child table, maybe called MemberOfYears, contains Mid, Bid, and a Year, and the parent, MemberOf, just becomes Mid & Bid. There would be one MemberOfYears record for each year the member was a member of that particular band.

2) Add an artificial key to the MemberOf relationship table and use it as the primary key.

少女七分熟 2024-10-20 08:44:02

One possible solution would be to store the StartYear value as the year the band was started. That way you could use (MId, BId, StartYear) as a primary key, assuming members can only join and quit once per year (as suggested by your design).

To test for "original member" you compare MemberOf.StartYear with Band.StartYear for equality.

Otherwise no, you don't have a valid primary key and need to add some kind of disambiguating column (JoinNumber or something), which seems like undesirable extra effort.
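The "original member" test could be sketched as follows, using SQLite through Python's sqlite3 module. The schema is an assumption (the question does not show the Band table), and the band, members and years are invented sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE band (
        bid        INTEGER PRIMARY KEY,
        start_year INTEGER NOT NULL
    );
    CREATE TABLE member_of (
        mid        INTEGER NOT NULL,
        bid        INTEGER NOT NULL REFERENCES band(bid),
        start_year INTEGER NOT NULL,   -- never NULL: founders get the band's start year
        end_year   INTEGER,            -- NULL still means "currently a member"
        PRIMARY KEY (mid, bid, start_year)
    );

    -- Invented sample data.
    INSERT INTO band VALUES (1, 1995);
    INSERT INTO member_of VALUES (10, 1, 1995, NULL);  -- in the band from the start
    INSERT INTO member_of VALUES (11, 1, 2001, 2004);  -- joined later
    INSERT INTO member_of VALUES (11, 1, 2008, NULL);  -- same member, rejoined
""")

# A member is "original" when a membership starts in the band's own start year.
original_members = [
    row[0]
    for row in conn.execute("""
        SELECT DISTINCT m.mid
        FROM member_of AS m
        JOIN band AS b ON b.bid = m.bid
        WHERE m.start_year = b.start_year
        ORDER BY m.mid
    """)
]
print(original_members)
```

Member 11 appears twice in member_of without violating the key because the two stints start in different years, which is exactly the once-per-year assumption stated above.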

我一直都在从未离去 2024-10-20 08:44:02

  • Everything in human history happened at some point in time, so StartYear really cannot be NULL.

If you introduce IsCurrent instead of EndYear, then there are no NULLs at all. If you do choose to keep EndYear, it is common practice to load it with some date in the far future, like 3000-01-01. And finally, if you do keep EndYear and leave it NULL, it is not tragic; the only problem is that it may be hard to distinguish between missing data and a current membership.

Note: StartDate may be more appropriate than StartYear.
