如何使用非唯一外键强制数据库完整性?
我想要一个数据库表来保存具有修订历史记录的数据(例如维基百科上的页面)。我认为一个好主意是使用两列来标识行:(name, version)
。因此,示例表如下所示:
TABLE PERSONS:
id: int,
name: varchar(30),
version: int,
... // some data assigned to that person.
因此,如果用户想要更新人员的数据,他们不会进行更新 - 相反,他们会创建一个具有相同 name
但不同 的新 PERSONS 行>版本
值。向用户显示的数据(对于给定的名称
)是具有最高版本
的数据。
我有第二个表,例如 DOGS,它引用 PERSONS 表中的人员:
TABLE DOGS:
id: int,
name: varchar(30),
owner_name: varchar(30),
...
显然,owner_name
是对 PERSONS.name
的引用,但我无法将其声明为外国键(在 MS SQL Server 中),因为 PERSONS.name
不是唯一的!
问题:那么,在 MS SQL Server 2008 中,我应该如何确保数据库完整性(即,对于每个 DOG,在 PERSONS 中至少存在一行,使其 PERSON.name == DOG。所有者名称)?
我正在寻找最优雅的解决方案——我知道我可以在 PERSONS 表上使用触发器,但这并不像我希望的那样具有声明性和优雅性。有什么想法吗?
附加信息
上面的设计有以下优点,如果我需要,我可以“记住”一个人当前的id
(或(名称,版本) 对),我确信该行中的数据永远不会更改。这很重要,例如,如果我将这个人的数据作为文档的一部分,然后打印出来,并且在 5 年后,有人可能想打印一份完全相同的副本(例如,使用与今天相同的数据),那么这将非常容易供他们做。
也许您可以想到一种完全不同的设计来实现相同的目的,并且可以更容易地强制执行其完整性(最好使用外键或其他约束)?
编辑:感谢迈克尔·加图索的回答,我发现了描述这种关系的另一种方式。有两种解决方案,我将其作为答案发布。请投票选出您更喜欢哪一位。
I want to have a database table that keeps data with revision history (like pages on Wikipedia). I thought that a good idea would be to have two columns that identify the row: (name, version)
. So a sample table would look like this:
TABLE PERSONS:
id: int,
name: varchar(30),
version: int,
... // some data assigned to that person.
So if users want to update person's data, they don't make an UPDATE -- instead, they create a new PERSONS row with the same name
but different version
value. Data shown to the user (for given name
) is the one with highest version
.
I have a second table, say, DOGS, that references persons in PERSONS table:
TABLE DOGS:
id: int,
name: varchar(30),
owner_name: varchar(30),
...
Obviously, owner_name
is a reference to PERSONS.name
, but I cannot declare it as a Foreign Key (in MS SQL Server), because PERSONS.name
is not unique!
Question: How, then, in MS SQL Server 2008, should I ensure database integrity (i.e., that for each DOG, there exists at least one row in PERSONS such that its PERSON.name == DOG.owner_name)?
I'm looking for the most elegant solution -- I know I could use triggers on PERSONS table, but this is not as declarative and elegant as I want it to be. Any ideas?
Additional Information
The design above has the following advantage that if I need to, I can "remember" a person's current id
(or (name, version)
pair) and I'm sure that data in that row will never be changed. This is important e.g. if I put this person's data as part of a document that is then printed and in 5 years someone might want to print a copy of it exactly unchanged (e.g. with the same data as today), then this will be very easy for them to do.
Maybe you can think of a completely different design that achieves the same purpose and its integrity can be enforced easier (preferably with foreign keys or other constraints)?
Edit: Thanks to Michael Gattuso's answer, I discovered another way this relationship can be described. There are two solutions, which I posted as answers. Please vote which one you like better.
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(8)
我参与的一个项目解决了类似的问题。这是一个生物记录数据库,随着新研究提高了对分类学的理解,物种名称可能会随着时间的推移而改变。
然而,旧记录需要保持与原始物种名称的相关性。它变得很复杂,但基本的解决方案是拥有一个仅包含所有唯一物种名称的 NAME 表、一个代表实际物种的物种表和一个将两者链接在一起的 NAME_VERSION 表。在任何时候都会有一个首选名称(即该物种当前接受的学名),它是 name_version 中保存的布尔字段。
在您的示例中,这将转换为“详细信息”表(detailsid、otherdetails 列)、一个名为“DetailsVersion”(detailsid、personid)的链接表和一个“人员表”(personid、不变数据)。将狗与人联系起来。
A project I have worked on addressed a similar problem. It was a biological records database where species names can change over time as new research improved understanding of taxonomy.
However old records needed to remain related to the original species names. It got complicated but the basic solution was to have a NAME table that just contained all unique species names, a species table that represented actual species and a NAME_VERSION table that linked the two together. At any one time there would be a preferred name (ie the currently accepted scientific name for the species) which was a boolean field held in name_version.
In your example this would translate to a Details table (detailsid, otherdetails columns) a link table called DetailsVersion (detailsid, personid) and a Person Table (personid, non-changing data). Relate dogs to Person.
人员
ID (int),
姓名,
......
activeVersion(这将是来自 personVersionInfo 的 UID)
注意:上表中每个人都有 1 行。将拥有创建该人的原始信息。
PersonVersionInfo
UID(识别人员+版本的唯一标识符),
id(整数),
姓名,
......
versionId(将为每个人生成)
Dogs
DogID,
狗名
......
PersonsWithDogs
UID,
DogID
编辑:您必须加入 PersonWithDogs、PersionVersionInfo、Dogs 才能了解完整情况(截至今天)。这种结构将帮助您将狗链接到所有者(具有特定版本)。
如果人的信息发生变化,并且您希望获得与狗相关的最新信息,则必须更新 PersonWithDogs 表以获得给定狗所需的(人的)UID。
您可以设置一些限制,例如 DogID 在 PersonWithDogs 中应该是唯一的。
而在这个结构中,一个UID(人)可以有很多个Dog。
您的场景(可以更改/限制等)将有助于更好地设计模式。
Persons
id (int),
name,
.....
activeVersion (this will be UID from personVersionInfo)
note: Above table will have 1 row for each person. will have original info with which person was created.
PersonVersionInfo
UID (unique identifier to identify person + version),
id (int),
name,
.....
versionId (this will be generated for each person)
Dogs
DogID,
DogName
......
PersonsWithDogs
UID,
DogID
EDIT: You will have to join PersonWithDogs, PersionVersionInfo, Dogs to get the full picture (as of today). This kind of structure will help you link a Dog to the Owner (with a specific version).
In case the Person's info changes and you wish to have latest info associated with the Dog, you will have to Update PersonWithDogs table to have the required UID (of the person) for the given Dog.
You can have restrictions such as DogID should be unique in PersonWithDogs.
And in this structure, a UID (person) can have many Dogs.
Your scenarios (what can change/restrictions etc) will help in designing the schema better.
感谢迈克尔·加图索的回答,我发现了描述这种关系的另一种方式。有两个解决方案,这是第一个。请投票选出您更喜欢哪一位。
解决方案 1
在 PERSONS 表中,我们仅保留姓名(唯一标识符)和指向当前人员数据的链接:
我们创建一个新表 PERSONS_DATA,其中包含所有数据历史记录对于那个人:
DOGS 表保持不变,它仍然指向一个人的名字(FK 到 PERSONS 表)。
优点:对于每只狗,至少存在一个 PERSONS_DATA 行,其中包含其所有者的数据(这就是我想要的)
缺点:如果要更改一个人的数据,则必须:
Thanks to Michael Gattuso's answer, I discovered another way this relationship can be described. There are two solutions, this is the first of them. Please vote which one you like better.
Solution 1
In PERSONS table, we leave only the name (unique identifier) and a link to current person's data:
We create a new table, PERSONS_DATA, that contains all data history for that person:
DOGS table stays the same, it still points to a person's name (FK to PERSONS table).
ADVANTAGE: for each dog, there exists at least one PERSONS_DATA row that contains data of its owner (that's what I wanted)
DISADVANTAGE: if you want to change a person's data, you have to:
感谢迈克尔·加图索的回答,我发现了描述这种关系的另一种方式。有两种解决方案,这是其中的第二个。请投票选出您更喜欢哪一位。
解决方案 2
在 PERSONS 表中,我们仅保留姓名(唯一标识符)和指向第一个(不是当前!)人员数据的链接:
我们创建一个新表 PERSONS_DATA ,包含该人的所有数据历史记录:
DOGS 表保持不变,它仍然指向一个人的姓名(FK 到 PERSONS 表)。
优点:
缺点:要检索当前人员的数据,我必须:
你觉得怎么样?
Thanks to Michael Gattuso's answer, I discovered another way this relationship can be described. There are two solutions, this is the second of them. Please vote which one you like better.
Solution 2
In PERSONS table, we leave only the name (unique identifier) and a link to the first (not current!) person's data:
We create a new table, PERSONS_DATA, that contains all data history for that person:
DOGS table stays the same, it still points to a person's name (FK to PERSONS table).
ADVANTAGES:
DISADVANTAGE: to retrieve current person's data, I have to either:
What do you think?
在您的父表中,创建对(id,版本)的唯一约束。将 version 列添加到您的子表,并使用检查约束来确保它始终为 0。使用 FK 约束将 (parentid, version) 映射到您的父表。
In your parent table, create a unique constraint on (id, version). Add version column to your child table, and use a check constraint to make sure that it is always 0. Use a FK constraint to map (parentid, version) to your parent table.
或者,您可以为具有历史价值的数据维护一个人员历史表。通过这种方式,您可以保持“人和狗”表格的整洁和参考的简单,同时还可以访问历史上有趣的信息。
Alternatively you could maintain a person history table for the data that has historic value. This way you keep your Persons and Dogs table tidy and the references simple but also have access to the historically interesting information.
好的,第一件事是您需要规范化您的表。谷歌“数据库规范化”,你会找到大量的阅读材料。 PERSONS 表尤其需要注意。
第二件事是,当您创建外键引用时,99.999% 的情况下您想要引用 ID(数字)值。即,[DOGS].[owner] 应该是对 [PERSONS].[id] 的引用。
编辑:添加示例架构(原谅松散的语法)。我假设每只狗都只有一个主人。这是实现人员历史记录的一种方式。所有列都不为空。
最新版本的数据将直接存储在 Persons 表中,较旧的数据存储在 PersonHistory 表中。
Okay, first thing is that you need to normalize your tables. Google "database normalization" and you'll come up with plenty of reading. The PERSONS table, in particular, needs attention.
Second thing is that when you're creating foreign key references, 99.999% of the time you want to reference an ID (numeric) value. I.e., [DOGS].[owner] should be a reference to [PERSONS].[id].
Edit: Adding an example schema (forgive the loose syntax). I'm assuming each dog has only a single owner. This is one way to implement Person history. All columns are not-null.
The latest version of the data would be stored in the Persons table directly, with older data stored in the PersonHistory table.
我会使用关联表将多个版本链接到一个 pk。
I would use and association table to link the many versions to the one pk.