如何实现这种多表数据库设计/约束,规范化?

发布于 2024-08-20 04:21:34 字数 1483 浏览 5 评论 0原文

我的数据有点像这样...

Elements
Class | Synthetic ID (pk)
A     | 2
A     | 3
B     | 4
B     | 5
C     | 6
C     | 7

Elements_Xref
ID (pk) | Synthetic ID | Real ID (fk)
.       | 2            | 77-8F         <--- A class
.       | 3            | 30-7D         <--- A class
.       | 6            | 21-2A         <--- C class
.       | 7            | 30-7D         <--- C class

所以我有这些元素,它们被分配了合成 ID 并分组为类。但这些合成 ID 会与我们真正关心的真实 ID 配对。还有一个限制,即真实 ID 不能在单个类中重复出现。我怎样才能在一个连贯的设计中捕捉到所有这些?

我不想将真实 ID 塞到上表中,因为

  1. 它可以为空(有时我们不知道某物的真实 ID 应该是什么)。
  2. 它是更多数据的外键。

显然,这可以通过充当约束的触发器来完成,但我想知道这是否可以通过常规约束/唯一索引来实现。使用SQL Server 2005

我考虑过拥有两个主表 SyntheticByClassRealByClass,然后将这些表的 ID 放入另一个外部参照/链接表中,但这仍然不能保证这些类两个元素匹配。也可以通过触发器解决。

编辑:这是关键字填充,但我认为它与标准化有关。

编辑^2:如下面的评论所示,我似乎暗示外键不能为空。这是错误的,他们可以!但不能做的是在重复 NULL 的字段上设置唯一索引。尽管唯一索引支持 NULL 值,但它们不能在一组中约束多个 NULL。由于真实 ID 分配最初是稀疏的,因此每个类很可能有多个 NULL 真实 ID。

编辑 ^3:删除了多余的 Elements.ID 列。

编辑^4:一般观察。似乎有三种主要方法在起作用,我已经提到了其中一种。

  1. 触发器。使用触发器作为约束来中断任何可能破坏数据完整性的数据操作。
  2. 为连接表的视图建立索引。太棒了,我不知道您可以使用视图和索引来做到这一点。
  3. 创建多列外键。没有想到这样做,也不知道这是可能的。将Class 字段添加到Xref 表中。创建对 (Class + Real ID) 的 UNIQUE 约束和对 (Class + Synthetic ID) 的外键约束返回到 Elements 表。

I have data that kinda looks like this...

Elements
Class | Synthetic ID (pk)
A     | 2
A     | 3
B     | 4
B     | 5
C     | 6
C     | 7

Elements_Xref
ID (pk) | Synthetic ID | Real ID (fk)
.       | 2            | 77-8F         <--- A class
.       | 3            | 30-7D         <--- A class
.       | 6            | 21-2A         <--- C class
.       | 7            | 30-7D         <--- C class

So I have these elements that are assigned synthetic IDs and are grouped into classes. But these synthetic IDs are then paired with Real IDs that we actually care about. There is also a constraint that a Real ID cannot recur in a single class. How can I capture all of this in one coherent design?

I don't want to jam the Real ID into the upper table because

  1. It is nullable (there are periods where we don't know what the Real ID of something should be).
  2. It's a foreign key to more data.

Obviously this could be done with triggers acting as constraints, but I'm wondering if this could be implemented with regular constraints/unique indexes. Using SQL Server 2005.

I've thought about having two main tables SyntheticByClass and RealByClass and then putting IDs of those tables into another xref/link table, but that still doesn't guarantee that the classes of both elements match. Also solvable via trigger.

Edit: This is keyword stuffing but I think it has to do with normalization.

Edit^2: As indicated in the comments below, I seem to have implied that foreign keys cannot be nullable. Which is false, they can! But what cannot be done is setting a unique index on fields where NULLs repeat. Although unique indexes support NULL values, they cannot constraint more than one NULL in a set. Since the Real ID assignment is initially sparse, multiple NULL Real IDs per class is more than likely.

Edit^3: Dropped the redundant Elements.ID column.

Edit^4: General observations. There seems to be three major approaches at work, one of which I already mentioned.

  1. Triggers. Use a trigger as a constraint to break any data operations that would corrupt the integrity of the data.
  2. Index a view that joins the tables. Fantastic, I had no idea you could do that with views and indexes.
  3. Create a multi-column foreign key. Didn't think of doing this, didn't know it was possible. Add the Class field to the Xref table. Create a UNIQUE constraint on (Class + Real ID) and a foreign key constraint on (Class + Synthetic ID) back to the Elements table.

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(7

冧九 2024-08-27 04:21:34

在该问题成为“奖励”问题之前的评论

您想要做的是表达 Elements 和 Elements_Xref 的连接对 Class 和 Real ID 有唯一的约束。如果您有一个支持 SQL-92 ASSERTION 约束的 DBMS,您就可以做到。

AFAIK,没有 DBMS 支持它们,所以你不得不使用触发器。

奇怪的是,该设计并没有限制真实 ID 在不同类别中是唯一的。从讨论来看,给定的真实 ID 似乎可以属于多个不同类别。如果真实 ID“除非为空则唯一”,那么如果 DBMS 支持“除非为空则唯一”概念(大多数不支持;我相信有一个可以,但我忘记了),那么您将能够更轻松地强制执行唯一性它就是)。


编辑前的评论 2010-02-08

该问题排除了上表(元素)中“干扰”Real_ID 的可能性;它不排除将 Class 包含在下表 (Elements_Xref) 中,然后允许您在 Elements_Xref 中的 Class 和 Real_ID 上创建唯一索引,从而实现(我相信)所需的结果。

从示例数据中尚不清楚 Elements 表中的合成 ID 是否唯一,或者是否可以在不同的类中重复(或者,实际上合成 ID 是否可以在单个类中重复)。鉴于似乎有一个 ID 列(大概是唯一的)以及合成 ID 列,似乎可以合理地假设有时合成 ID 会重复 - 否则表中没有很好的理由有两个唯一列。在大多数情况下,这并不重要 - 但如果将类复制到 Elements_Xref 表,它确实会影响唯一性约束。又一种可能;也许元素表中根本不需要该类;它应该只存在于 Elements_Xref 表中。我们没有足够的信息来判断这是否有可能。


所做更改的注释 2010-02-08

现在 Elements 表已将 Synthetic ID 作为主键,事情变得更容易了。有评论说“班级”信息实际上是“月份”,但我会尝试忽略它。

在 Elements_Xref 表中,我们有一个唯一的 ID 列,然后是一个合成 ID(它没有标记为 Elements 的外键,但可能实际上必须是一个),以及真实 ID。从示例数据中我们可以看到,多个合成 ID 可以映射到给定的真实 ID。目前尚不清楚为什么 Elements_Xref 表同时具有 ID 列和合成 ID 列。

我们不知道单个合成 ID 是否只能映射到单个真实 ID,或者是否可以映射到多个真实 ID 值。

由于 Synthetic ID 是 Elements 的主键,因此我们知道单个 Synthetic ID 对应于单个 Class。

我们不知道合成 ID 到真实 ID 的映射是否会随时间变化(可能是类与日期相关),以及是否必须记住旧状态。

我们可以假设表格已减少到最低限度,并且每个表格中还有其他列,其内容对问题并不直接重要。

该问题表明真实 ID 是其他数据的外键并且可以为 NULL。

我看不到一个完全有效的非冗余设计。

我认为 Elements_Xref 表应包含:

  • 合成 ID
  • 真实 ID

,其中(合成 ID,类)作为引用元素的“外键”,以及对真实 ID 的 NOT NULL 约束,以及对(类,真实 ID)的唯一约束。

Elements_Xref 表仅包含真实 ID 已知的行,并正确强制执行所需的唯一性约束。

奇怪的是,Elements_Xref 中的(综合 ID,类)数据必须与 Elements 中的相同列匹配,即使综合 ID 是 Elements 的主键。

在 IBM Informix Dynamic Server 中,您可以实现这一点:

CREATE TABLE elements
(
    class CHAR(1) NOT NULL,
    synthetic_id SERIAL NOT NULL PRIMARY KEY,
    UNIQUE(class, synthetic_id)
);

CREATE TABLE elements_xref
(
    class CHAR(1) NOT NULL,
    synthetic_id INTEGER NOT NULL REFERENCES elements(synthetic_id),
    FOREIGN KEY (class, synthetic_id) REFERENCES elements(class, synthetic_id),
    real_id    CHAR(5) NOT NULL,
    PRIMARY KEY (class, real_id)
);

Comments from before the question was made into a 'bonus' question

What you'd like to be able to do is express that the join of Elements and Elements_Xref has a unique constraint on Class and Real ID. If you had a DBMS that supported SQL-92 ASSERTION constraints, you could do it.

AFAIK, no DBMS supports them, so you are stuck with using triggers.

It seems odd that the design does not constrain Real ID to be unique across classes; from the discussion, it seems that a given Real ID could be part of several different classes. Were the Real ID 'unique unless null', then you would be able to enforce the uniqueness more easily, if the DBMS supported the 'unique unless null' concept (most don't; I believe there is one that does, but I forget which it is).


Comments before edits made 2010-02-08

The question rules out 'jamming' the Real_ID in the upper table (Elements); it doesn't rule out including the Class in the lower table (Elements_Xref), which then allows you to create a unique index on Class and Real_ID in Elements_Xref, achieving (I believe) the required result.

It isn't clear from the sample data whether the synthetic ID in the Elements table is unique or whether it can repeat with different classes (or, indeed whether a synthetic ID can be repeated in a single class). Given that there seems to be an ID column (which presumably is unique) as well as the Synthetic ID column, it seems reasonable to suppose that sometimes the synthetic ID repeats - otherwise there are two unique columns in the table for no very good reason. For the most part, it doesn't matter - but it does affect the uniqueness constraint if the class is copied to the Elements_Xref table. One more possibility; maybe the Class is not needed in the Elements table at all; it should live only in the Elements_Xref table. We don't have enough information to tell whether this is a possibility.


Comments for changes made 2010-02-08

Now that the Elements table has the Synthetic ID as the primary key, things are somewhat easier. There's a comment that the 'Class' information actually is a 'month', but I'll try to ignore that.

In the Elements_Xref table, we have an unique ID column, and then a Synthetic ID (which is not marked as a foreign key to Elements, but presumably must actually be one), and the Real ID. We can see from the sample data that more than one Synthetic ID can map to a given Real ID. It is not clear why the Elements_Xref table has both the ID column and the Synthetic ID column.

We do not know whether a single Synthetic ID can only map to a single Real ID or whether it can map to several Real ID values.

Since the Synthetic ID is the primary key of Elements, we know that a single Synthetic ID corresponds to a single Class.

We don't know whether the mapping of Synthetic ID to Real ID varies over time (it might as Class is date-related), and whether the old state has to be remembered.

We can assume that the tables are reduced to the bare minimum and that there are other columns in each table, the contents of which are not directly material to the question.

The problem states that the Real ID is a foreign key to other data and can be NULL.

I can't see a perfectly non-redundant design that works.

I think that the Elements_Xref table should contain:

  • Synthetic ID
  • Class
  • Real ID

with (Synthetic ID, Class) as a 'foreign key' referencing Elements, and a NOT NULL constraint on Real ID, and a unique constraint on (Class, Real ID).

The Elements_Xref table only contains rows for which the Real ID is known - and correctly enforces the uniqueness constraint that is needed.

The weird bit is that the (Synthetic ID, Class) data in Elements_Xref must match the same columns in Elements, even though the Synthetic ID is the primary key of Elements.

In IBM Informix Dynamic Server, you can achieve this:

CREATE TABLE elements
(
    class CHAR(1) NOT NULL,
    synthetic_id SERIAL NOT NULL PRIMARY KEY,
    UNIQUE(class, synthetic_id)
);

CREATE TABLE elements_xref
(
    class CHAR(1) NOT NULL,
    synthetic_id INTEGER NOT NULL REFERENCES elements(synthetic_id),
    FOREIGN KEY (class, synthetic_id) REFERENCES elements(class, synthetic_id),
    real_id    CHAR(5) NOT NULL,
    PRIMARY KEY (class, real_id)
);
生寂 2024-08-27 04:21:34

我会:

  1. 在 Elements(Synthetic ID, Class) 上创建一个 UNIQUE 约束
  2. 将 Class 列添加到 Elements_Xref
  3. 在 Elements_Xref 表上添加一个 FOREIGN KEY 约束,引用 (Synthetic ID, Class)

此时我们确信 Elements_Xref.Class 始终匹配元素.类。

现在我们需要实现“非空时唯一”逻辑。点击链接并滚动到“使用计算列实施复杂业务规则”部分:
计算列上的索引:加快查询速度,添加业务规则

或者,您可以在其 WHERE 子句中使用 WHERE RealID IS NOT NULL 在 (Class, RealID) 上创建索引视图 - 这也将强制执行“非空时唯一”逻辑。

I would:

  1. Create a UNIQUE constraint on Elements(Synthetic ID, Class)
  2. Add Class column to Elements_Xref
  3. Add a FOREIGN KEY constraint on Elements_Xref table, referring to (Synthetic ID, Class)

At this point we know for sure that Elements_Xref.Class always matches Elements.Class.

Now we need to implement "unique when not null" logic. Follow the link and scroll to section "Use Computed Columns to Implement Complex Business Rules":
Indexes on Computed Columns: Speed Up Queries, Add Business Rules

Alternatively, you can create an indexed view on (Class, RealID) with WHERE RealID IS NOT NULL in its WHERE clause - that will also enforce "unique when not null" logic.

困倦 2024-08-27 04:21:34

使用“Where Real_Id Is Not Null”为 Elements_Xref 创建索引视图,然后在该视图上创建唯一索引。

Create View Elements_Xref_View With SchemaBinding As
Select Elements.Class, Elements_Xref.Real_Id
From Elements_Xref
Inner Join Element On Elements.Synthetic_Id = Elements_Xref.Synthetic_Id
Where Real_Id Is Not Null
Go

Create Unique Clustered Index Elements_Xref_Unique_Index
On Elements_Xref_View (Class, Real_Id)
Go

除了模拟正确处理空值的唯一索引(即 null != null)之外,这没有其他用途。

Create an indexed view for Elements_Xref with Where Real_Id Is Not Null and then create a unique index on that view

Create View Elements_Xref_View With SchemaBinding As
Select Elements.Class, Elements_Xref.Real_Id
From Elements_Xref
Inner Join Element On Elements.Synthetic_Id = Elements_Xref.Synthetic_Id
Where Real_Id Is Not Null
Go

Create Unique Clustered Index Elements_Xref_Unique_Index
On Elements_Xref_View (Class, Real_Id)
Go

This serves no other purpose other than simulating a unique index that treats nulls properly i.e. null != null

软的没边 2024-08-27 04:21:34

您可以

  1. 从在Synthetic ID上将Elements_XrefElements连接在一起的结果集中创建视图

    创建

  2. class 上添加唯一约束> 和[真实身份]。在其他新闻中,这也是您在 MSSQL 中通过索引视图来执行功能索引的方式。

下面是一些 sql:

CREATE VIEW unique_const_view AS
SELECT e.[Synthetic ID], e.Class, x.[Real ID]
FROM Elements AS e
JOIN [Elements_Xref] AS x
  ON e.[Synthetic ID] = x.[Synthetic ID]

CREATE UNIQUE INDEX unique_const_view_index ON unique_const_view ( Class, [Real ID] );

现在,显然,我自己不知道这个解决方案在 Microsoft 的土地上不起作用,因为使用 MS SQL Server 重复空值将违反 UNIQUE 约束:这违反了 SQL 规范。 这是问题讨论了

这是 Microsoft 的解决方法:

create unique nonclustered index idx on dbo.DimCustomer(emailAddress)
where EmailAddress is not null;

不确定是 2005 年还是 2008 年。

You can

  1. Create a view from the the result set of joining Elements_Xref and Elements together on Synthetic ID

  2. add a unique constraint on class, and [Real ID]. In other news, this is also how you do functional indexes in MSSQL, by indexing views.

Here is some sql:

CREATE VIEW unique_const_view AS
SELECT e.[Synthetic ID], e.Class, x.[Real ID]
FROM Elements AS e
JOIN [Elements_Xref] AS x
  ON e.[Synthetic ID] = x.[Synthetic ID]

CREATE UNIQUE INDEX unique_const_view_index ON unique_const_view ( Class, [Real ID] );

Now, apparently, unbeknownst to myself this solution doesn't work in Microsoft-land-place because with MS SQL Server duplicate nulls will violate a UNIQUE constraint: this is against the SQL spec. This is where the problem is discussed about.

This is the Microsoft workaround:

create unique nonclustered index idx on dbo.DimCustomer(emailAddress)
where EmailAddress is not null;

Not sure if that is 2005, or just 2008.

叹梦 2024-08-27 04:21:34

我认为触发器是你最好的选择。约束不能交叉到其他表来获取信息。与唯一索引相同(尽管我认为带有索引的物化视图可能是可能的),它们在表中是唯一的。将触发器组合在一起时,请记住以基于集合的方式进行操作,而不是逐行进行操作,并使用多行插入和多行更新进行测试,其中真实键在数据集中重复。

I think a trigger is your best option. Constraints can't cross to other tables to get information. Same thing with a unique index (although I suppose a materialized view with an index might be possible), they are unique within the table. When you put the trigger together, remember to do it in a set-based fashion not row-by-row and test with a multi-row insert and multi-row update where the real key is repeated in the dataset.

深居我梦 2024-08-27 04:21:34

我认为您的两个原因中的任何一个都不会成为将真实ID放入Elements的障碍。如果给定元素有 0 或 1 个真实 ID(但绝不会超过 1 个),则它绝对应该位于 Elements 表中。这将允许您限制类内的唯一性(我认为)。

您能否详细说明一下不这样做的两个原因?

I don't think either of your two reasons are an obstacle to putting Real ID in Elements. If a given element has 0 or 1 Real IDs (but never more than 1), it should absolutely be in the Elements table. This would then allow you to constrain uniqueness within Class (I think).

Could you expand on your two reasons not to do this?

坠似风落 2024-08-27 04:21:34

创建一个新表 real_elements,其中包含字段 Real ID、Class 和 Synthetic ID,主键为 Class、RealId,并在实际添加 RealID 时添加元素。

这限制 Real ID 对于类来说是唯一的,并为您提供了一种匹配类的方法真实 ID 与合成 ID

至于真实 ID 是外键,您的意思是如果它属于两个类,那么锁定的数据将是相同的。如果是这样,请添加另一个带有密钥 Real Id 的表。该键是 real_elements 和任何其他需要真实 ID 作为外键的表的外键

Create a new table real_elements with fields Real ID, Class and Synthetic ID with a primary key of Class, RealId and add elements when you actually add a RealID

This constrains Real IDs to be unique for a class and gives you a way to match a class and real ID to the synthetic ID

As for Real ID being a foreign key do you mean that if it is in two classes then the data keyed off it will be the same. If so the add another table with key Real Id. This key is then a foreign key into real_elements and any other table needing real ID as foreign key

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文