如何使用非唯一外键强制数据库完整性？

发布于 2024-08-07 03:33:39 字数 1313 浏览 3 评论 0原文

我想要一个数据库表来保存具有修订历史记录的数据（例如维基百科上的页面）。我认为一个好主意是使用两列来标识行：(name, version)。因此，示例表如下所示：

TABLE PERSONS:
    id:      int,
    name:    varchar(30),
    version: int,
    ... // some data assigned to that person.

因此，如果用户想要更新人员的数据，他们不会进行更新 - 相反，他们会创建一个具有相同 name 但不同 的新 PERSONS 行>版本值。向用户显示的数据（对于给定的名称）是具有最高版本的数据。

我有第二个表，例如 DOGS，它引用 PERSONS 表中的人员：

TABLE DOGS:
    id:         int,
    name:       varchar(30),
    owner_name: varchar(30),
    ...

显然，owner_name 是对 PERSONS.name 的引用，但我无法将其声明为外国键（在 MS SQL Server 中），因为 PERSONS.name 不是唯一的！

问题：那么，在 MS SQL Server 2008 中，我应该如何确保数据库完整性（即，对于每个 DOG，在 PERSONS 中至少存在一行，使其 PERSON.name == DOG。所有者名称）？

我正在寻找最优雅的解决方案——我知道我可以在 PERSONS 表上使用触发器，但这并不像我希望的那样具有声明性和优雅性。有什么想法吗？

附加信息

上面的设计有以下优点，如果我需要，我可以“记住”一个人当前的id（或（名称，版本）对），我确信该行中的数据永远不会更改。这很重要，例如，如果我将这个人的数据作为文档的一部分，然后打印出来，并且在 5 年后，有人可能想打印一份完全相同的副本（例如，使用与今天相同的数据），那么这将非常容易供他们做。

也许您可以想到一种完全不同的设计来实现相同的目的，并且可以更容易地强制执行其完整性（最好使用外键或其他约束）？

编辑：感谢迈克尔·加图索的回答，我发现了描述这种关系的另一种方式。有两种解决方案，我将其作为答案发布。请投票选出您更喜欢哪一位。

原文

I want to have a database table that keeps data with revision history (like pages on Wikipedia). I thought that a good idea would be to have two columns that identify the row: (name, version). So a sample table would look like this:

TABLE PERSONS:
    id:      int,
    name:    varchar(30),
    version: int,
    ... // some data assigned to that person.

So if users want to update person's data, they don't make an UPDATE -- instead, they create a new PERSONS row with the same name but different version value. Data shown to the user (for given name) is the one with highest version.

I have a second table, say, DOGS, that references persons in PERSONS table:

TABLE DOGS:
    id:         int,
    name:       varchar(30),
    owner_name: varchar(30),
    ...

Obviously, owner_name is a reference to PERSONS.name, but I cannot declare it as a Foreign Key (in MS SQL Server), because PERSONS.name is not unique!

Question: How, then, in MS SQL Server 2008, should I ensure database integrity (i.e., that for each DOG, there exists at least one row in PERSONS such that its PERSON.name == DOG.owner_name)?

I'm looking for the most elegant solution -- I know I could use triggers on PERSONS table, but this is not as declarative and elegant as I want it to be. Any ideas?

Additional Information

The design above has the following advantage that if I need to, I can "remember" a person's current id (or (name, version) pair) and I'm sure that data in that row will never be changed. This is important e.g. if I put this person's data as part of a document that is then printed and in 5 years someone might want to print a copy of it exactly unchanged (e.g. with the same data as today), then this will be very easy for them to do.

Maybe you can think of a completely different design that achieves the same purpose and its integrity can be enforced easier (preferably with foreign keys or other constraints)?

Edit: Thanks to Michael Gattuso's answer, I discovered another way this relationship can be described. There are two solutions, which I posted as answers. Please vote which one you like better.

分享到QQ

分享到微博

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

需要登录才能够评论，你可以免费注册一个本站的账号。

人海汹涌 2024-08-14 03:33:40

我参与的一个项目解决了类似的问题。这是一个生物记录数据库，随着新研究提高了对分类学的理解，物种名称可能会随着时间的推移而改变。

然而，旧记录需要保持与原始物种名称的相关性。它变得很复杂，但基本的解决方案是拥有一个仅包含所有唯一物种名称的 NAME 表、一个代表实际物种的物种表和一个将两者链接在一起的 NAME_VERSION 表。在任何时候都会有一个首选名称（即该物种当前接受的学名），它是 name_version 中保存的布尔字段。

在您的示例中，这将转换为“详细信息”表（detailsid、otherdetails 列）、一个名为“DetailsVersion”（detailsid、personid）的链接表和一个“人员表”（personid、不变数据）。将狗与人联系起来。

回复收藏 0 原文

浪推晚风 2024-08-14 03:33:40

人员

ID (int)，
姓名，
......
activeVersion（这将是来自 personVersionInfo 的 UID）

注意：上表中每个人都有 1 行。将拥有创建该人的原始信息。

PersonVersionInfo

UID（识别人员+版本的唯一标识符），
id（整数），
姓名，
......
versionId（将为每个人生成）

Dogs

DogID，
狗名
......

PersonsWithDogs

UID，
DogID

编辑：您必须加入 PersonWithDogs、PersionVersionInfo、Dogs 才能了解完整情况（截至今天）。这种结构将帮助您将狗链接到所有者（具有特定版本）。

如果人的信息发生变化，并且您希望获得与狗相关的最新信息，则必须更新 PersonWithDogs 表以获得给定狗所需的（人的）UID。

您可以设置一些限制，例如 DogID 在 PersonWithDogs 中应该是唯一的。
而在这个结构中，一个UID（人）可以有很多个Dog。

您的场景（可以更改/限制等）将有助于更好地设计模式。

回复收藏 0 原文

无法回应 2024-08-14 03:33:40

感谢迈克尔·加图索的回答，我发现了描述这种关系的另一种方式。有两个解决方案，这是第一个。请投票选出您更喜欢哪一位。

解决方案 1

在 PERSONS 表中，我们仅保留姓名（唯一标识符）和指向当前人员数据的链接：

TABLE PERSONS:
    name:            varchar(30),
    current_data_id: int

我们创建一个新表 PERSONS_DATA，其中包含所有数据历史记录对于那个人：

TABLE PERSONS_DATA:
    id:      int
    version: int (auto-generated)
    ... // some data, like address, etc.

DOGS 表保持不变，它仍然指向一个人的名字（FK 到 PERSONS 表）。

优点：对于每只狗，至少存在一个 PERSONS_DATA 行，其中包含其所有者的数据（这就是我想要的）

缺点：如果要更改一个人的数据，则必须：

添加一个新的 PERSONS_DATA
行为此更新 PERSONS 条目person 指向新的 PERSONS_DATA 行。

Thanks to Michael Gattuso's answer, I discovered another way this relationship can be described. There are two solutions, this is the first of them. Please vote which one you like better.

Solution 1

In PERSONS table, we leave only the name (unique identifier) and a link to current person's data:

TABLE PERSONS:
    name:            varchar(30),
    current_data_id: int

We create a new table, PERSONS_DATA, that contains all data history for that person:

TABLE PERSONS_DATA:
    id:      int
    version: int (auto-generated)
    ... // some data, like address, etc.

DOGS table stays the same, it still points to a person's name (FK to PERSONS table).

ADVANTAGE: for each dog, there exists at least one PERSONS_DATA row that contains data of its owner (that's what I wanted)

DISADVANTAGE: if you want to change a person's data, you have to:

add a new PERSONS_DATA row
update PERSONS entry for this person to point to the new PERSONS_DATA row.

回复收藏 0 原文

徒留西风 2024-08-14 03:33:40

感谢迈克尔·加图索的回答，我发现了描述这种关系的另一种方式。有两种解决方案，这是其中的第二个。请投票选出您更喜欢哪一位。

解决方案 2

在 PERSONS 表中，我们仅保留姓名（唯一标识符）和指向第一个（不是当前！）人员数据的链接：

TABLE PERSONS:
    name:            varchar(30),
    first_data_id: int

我们创建一个新表 PERSONS_DATA ，包含该人的所有数据历史记录：

TABLE PERSONS_DATA:
    id:      int
    name:    varchar(30)
    version: int (auto-generated)
    ... // some data, like address, etc.

DOGS 表保持不变，它仍然指向一个人的姓名（FK 到 PERSONS 表）。

优点：

对于每只狗，至少存在一个 PERSONS_DATA 行，其中包含其所有者的数据（这就是我想要的），
如果我想更改一个人的数据，我不必更新 PERSONS 行，只需添加一个新的 PERSONS_DATA行

缺点：要检索当前人员的数据，我必须：

选择具有给定名称和最高版本的 PERSONS_DATA （可能很昂贵）
选择具有特殊版本的 PERSONS_DATA，例如“-1”，但随后我必须分别更新两个 PERSONS_DATA 行当我添加新的 PERSONS_DATA 时，在这个解决方案中我想避免更新 2 行......

你觉得怎么样？

Thanks to Michael Gattuso's answer, I discovered another way this relationship can be described. There are two solutions, this is the second of them. Please vote which one you like better.

Solution 2

In PERSONS table, we leave only the name (unique identifier) and a link to the first (not current!) person's data:

TABLE PERSONS:
    name:            varchar(30),
    first_data_id: int

We create a new table, PERSONS_DATA, that contains all data history for that person:

TABLE PERSONS_DATA:
    id:      int
    name:    varchar(30)
    version: int (auto-generated)
    ... // some data, like address, etc.

DOGS table stays the same, it still points to a person's name (FK to PERSONS table).

ADVANTAGES:

for each dog, there exists at least one PERSONS_DATA row that contains data of its owner (that's what I wanted)
if I want to change a person's data, I don't have to update the PERSONS row, only add a new PERSONS_DATA row

DISADVANTAGE: to retrieve current person's data, I have to either:

choose PERSONS_DATA with given name and highest version (may be expensive)
choose PERSONS_DATA with special version, e.g. "-1", but then I would have to update two PERSONS_DATA rows each time I add new PERSONS_DATA, and in this solution I wanted to avoid having to update 2 rows...

What do you think?

回复收藏 0 原文

旧城烟雨 2024-08-14 03:33:39

在您的父表中，创建对（id，版本）的唯一约束。将 version 列添加到您的子表，并使用检查约束来确保它始终为 0。使用 FK 约束将 (parentid, version) 映射到您的父表。

回复收藏 0 原文

跨年 2024-08-14 03:33:39

或者，您可以为具有历史价值的数据维护一个人员历史表。通过这种方式，您可以保持“人和狗”表格的整洁和参考的简单，同时还可以访问历史上有趣的信息。

回复收藏 0 原文

浸婚纱 2024-08-14 03:33:39

好的，第一件事是您需要规范化您的表。谷歌“数据库规范化”，你会找到大量的阅读材料。 PERSONS 表尤其需要注意。

第二件事是，当您创建外键引用时，99.999% 的情况下您想要引用 ID（数字）值。即，[DOGS].[owner] 应该是对 [PERSONS].[id] 的引用。

编辑：添加示例架构（原谅松散的语法）。我假设每只狗都只有一个主人。这是实现人员历史记录的一种方式。所有列都不为空。

Persons Table:
int Id
varchar(30) name
...

PersonHistory Table:
int Id
int PersonId (foreign key to Persons.Id)
int Version (auto-increment)
varchar(30) name
...

Dogs Table:
int Id
int OwnerId (foreign key to Persons.Id)
varchar(30) name
...

最新版本的数据将直接存储在 Persons 表中，较旧的数据存储在 PersonHistory 表中。

Okay, first thing is that you need to normalize your tables. Google "database normalization" and you'll come up with plenty of reading. The PERSONS table, in particular, needs attention.

Second thing is that when you're creating foreign key references, 99.999% of the time you want to reference an ID (numeric) value. I.e., [DOGS].[owner] should be a reference to [PERSONS].[id].

Edit: Adding an example schema (forgive the loose syntax). I'm assuming each dog has only a single owner. This is one way to implement Person history. All columns are not-null.

Persons Table:
int Id
varchar(30) name
...

PersonHistory Table:
int Id
int PersonId (foreign key to Persons.Id)
int Version (auto-increment)
varchar(30) name
...

Dogs Table:
int Id
int OwnerId (foreign key to Persons.Id)
varchar(30) name
...

The latest version of the data would be stored in the Persons table directly, with older data stored in the PersonHistory table.

回复收藏 0 原文