Can I have an identity field span multiple tables in SQL Server?

Posted on 2024-08-24 19:02:45

Can I have an "identity" (unique, non-repeating) column span multiple tables?
For example, let's say I have two tables: Books and Authors.

Authors
  AuthorID
  AuthorName
Books
  BookID
  BookTitle

The BookID column and the AuthorID column are identity columns.
I want the identity part to span both columns.
So, if there is an AuthorID with a value of 123, then there cannot be a BookID with a value of 123. And vice versa.

I hope that makes sense.

Is this possible?

Thanks.

Why do I want to do this? I am writing an ASP.NET MVC app. I am creating a comment section. Authors can have comments. Books can have comments. I want to be able to pass an entity ID (a book ID or an author ID) to an action and have the action pull up all the corresponding comments. The action won't care if it's a book or an author or whatever. Sound reasonable?

柳絮泡泡 2024-08-31 19:02:45

Even if you could put the identity sequence across multiple tables, your comment table is not going to be able to reference both columns in a single foreign key.

The best way to do this, in terms of relational database design theory, would be to create two comment tables. But obviously, you want to avoid that, probably for code-reuse reasons.

The most straightforward pragmatic approach would be to put two foreign key columns on the comment table, and just make one null and the other not null for each comment.

Another approach, which might be the best compromise, is this. You refer in your question to an "entity ID". So make an Entity table! Then the authors and books and comments can all refer to that table.
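A minimal sketch of the Entity-table idea, run against SQLite through Python's sqlite3 module so that it works standalone. The table and column names (entity, author, book, comment) and the helper functions are assumptions made for illustration; on SQL Server the key column in entity would typically be an IDENTITY column and the text columns nvarchar.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
CREATE TABLE entity  (entity_id INTEGER PRIMARY KEY);   -- the one shared id space
CREATE TABLE author  (author_id INTEGER PRIMARY KEY REFERENCES entity(entity_id),
                      author_name TEXT NOT NULL);
CREATE TABLE book    (book_id INTEGER PRIMARY KEY REFERENCES entity(entity_id),
                      book_title TEXT NOT NULL);
CREATE TABLE comment (comment_id INTEGER PRIMARY KEY,
                      entity_id INTEGER NOT NULL REFERENCES entity(entity_id),
                      comment_text TEXT NOT NULL);
""")

def new_entity_id(conn):
    """Allocate the next id from the shared entity table."""
    return conn.execute("INSERT INTO entity DEFAULT VALUES").lastrowid

# Authors and books both draw their keys from entity, so the ids never overlap
# as long as every row is created through new_entity_id.
author_id = new_entity_id(conn)
conn.execute("INSERT INTO author VALUES (?, ?)", (author_id, "Jane Doe"))
book_id = new_entity_id(conn)
conn.execute("INSERT INTO book VALUES (?, ?)", (book_id, "Some Book"))

conn.execute("INSERT INTO comment (entity_id, comment_text) VALUES (?, ?)",
             (book_id, "Great read"))

def comments_for(conn, entity_id):
    """Return all comment texts for an entity, whatever kind of entity it is."""
    rows = conn.execute(
        "SELECT comment_text FROM comment WHERE entity_id = ? ORDER BY comment_id",
        (entity_id,))
    return [row[0] for row in rows]
```

With this layout the comment lookup needs only the entity ID, which is what the question asks for. The foreign keys alone do not stop an author row and a book row from pointing at the same entity row, so that rule would still need to be enforced separately.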

Edited to add:

Philip Kelley, Ray, and (I think) Artic have all suggested modifying the comment table by adding an entity_id, which can refer to either the book_id or the author_id, and a flag of some sort (char(1), tinyint, and boolean, respectively) that indicates which of these is being referred to.

This is not a good solution for many reasons, both pragmatic (including data integrity, reporting, efficiency) and theoretical.

The first and most obvious problem is the data integrity problem. A relational database system should always be responsible for maintaining the integrity of its own data, and there are natural and preferred ways that the DB is designed to do this. One of the most important of these mechanisms is the foreign key system. If the comment.entity_id column is to reference both book.book_id and author.author_id, then a foreign key cannot be created for this column.

Sure, you could put a check in your DML (insert, update, delete) stored procedures to verify the references, but that would quickly turn into a big mess, as all DML operations on all three tables would be involved.

And that leads us to the efficiency problem. Whenever a query is run against the comment table, it will require joins to either the author or book table or both. The query plan generation system will not have foreign keys available to optimize with, so its performance could very well be degraded.

Then there are problems with this scheme in reporting. Any report generating system is going to have trouble with this sort of system. Sure, this won't be a problem for expert programmers, but any user ad-hoc reports are going to have to mock up the logic behind when the entity_id means this or that, and it could be a pretty bad deal. Maybe you won't ever use report generating tools on this database. But then again, nobody knows where a database is going to be ultimately used. Why not work with the system to allow for anything?

And that leads us to the theoretical problems.

In relational database theory, each row (a.k.a. "tuple") in each table ("relation variable") represents a proposition about the real world. Designing a table is to decide the form of that proposition. Let's look at a few examples of how this might work.

comment (comment_id int, comment_type char(1), entity_id int, 
         user_id int, comment_text nvarchar(max), comment_date datetime)
/* comment_id identifies a comment (comment_text) that a user (user_id) 
   has made about a book (entity_id if comment_type = 'B') or author 
   (entity_id if comment_type = 'A') at a particular date and 
   time (comment_date).*/

Here it is clear that the column (or "attribute") called entity_id is doing double-duty. It doesn't really represent anything, except with reference to another column. This is workable, but unsatisfactory.

comment (comment_id int, book_id int, author_id int, user_id int, 
         comment_text nvarchar(max), comment_date datetime)
/* comment_id identifies a comment (comment_text) that a user (user_id) 
   has made about a book (book_id if not null) or author (author_id if 
   not null) at a particular date and time (comment_date). */

This buys us the foreign keys that are the biggest omission from the first version. But this still isn't terribly satisfactory, unless a single comment can refer to both a book and an author (which might be reasonable). Nullable columns are a warning sign that something is wrong with the design, and that may be the case here as well. A check constraint may be necessary to avoid a comment that refers to nothing at all, or to both a book and an author if that is not to be allowed.
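The check constraint mentioned above could look something like the following sketch, again using SQLite through Python's sqlite3 so that it runs on its own. The reduced column list and the try_insert helper are assumptions for illustration; the CHECK expression itself is plain SQL and should carry over to SQL Server unchanged.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
CREATE TABLE author (author_id INTEGER PRIMARY KEY, author_name TEXT NOT NULL);
CREATE TABLE book   (book_id   INTEGER PRIMARY KEY, book_title  TEXT NOT NULL);
CREATE TABLE comment (
    comment_id   INTEGER PRIMARY KEY,
    book_id      INTEGER REFERENCES book(book_id),
    author_id    INTEGER REFERENCES author(author_id),
    comment_text TEXT NOT NULL,
    -- exactly one of the two references must be filled in
    CHECK ((book_id IS NOT NULL AND author_id IS NULL)
        OR (book_id IS NULL AND author_id IS NOT NULL))
);
INSERT INTO author VALUES (1, 'Jane Doe');
INSERT INTO book   VALUES (1, 'Some Book');
""")

def try_insert(book_id, author_id):
    """Return True if the comment row is accepted, False if a constraint rejects it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO comment (book_id, author_id, comment_text) "
                "VALUES (?, ?, 'text')",
                (book_id, author_id))
        return True
    except sqlite3.IntegrityError:
        return False
```

Here a comment about book 1 or about author 1 is accepted, while a comment about nothing, about both, or about a book that does not exist is rejected by the database itself. If comments on both a book and an author at once should be allowed, the CHECK would be relaxed to require only that at least one column is non-null.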

From a theoretical perspective (and thus, my perspective :)) there is a clear best option:

book_comment (book_comment_id int, book_id int, user_id int, 
              comment_text nvarchar(max), comment_date datetime)
/* book_comment_id identifies a comment (comment_text) that a 
   user (user_id) has made about a book (book_id) at a particular 
   date and time (comment_date). */

author_comment (author_comment_id int, author_id int, user_id int, 
                comment_text nvarchar(max), comment_date datetime)
/* author_comment_id identifies a comment (comment_text) that a 
   user (user_id) has made about an author (author_id) at a particular 
   date and time (comment_date). */

This last option would provide the best efficiency, data integrity, and ease of reporting. And the only expense would be that the DML stored procedures would need to put the comments into the right tables, which is not a big deal, since they had to know what the comments were referring to anyway.

If your plan was to retrieve all comments for a book or author at once, then you can easily create a view on top of these tables that reproduces the other designs, if that's what you want to do.

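-- Present both comment tables as one result set, tagged by target type.
create view comments as
select
    book_comment_id as comment_id,
    book_id as entity_id,
    comment_text,
    'B' as comment_type
from book_comment
union all  -- the two branches never produce duplicate rows, so the de-duplication done by plain union is not needed
select
    author_comment_id as comment_id,
    author_id as entity_id,
    comment_text,
    'A' as comment_type
from author_comment;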
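As a rough illustration of how such a view might be queried, here is a sketch using SQLite through Python's sqlite3. The trimmed-down tables, the sample rows, and the comments_for helper are assumptions; the view uses UNION ALL because rows from the two tables can never be duplicates of each other.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE book_comment   (book_comment_id INTEGER PRIMARY KEY,
                             book_id INTEGER NOT NULL,
                             comment_text TEXT NOT NULL);
CREATE TABLE author_comment (author_comment_id INTEGER PRIMARY KEY,
                             author_id INTEGER NOT NULL,
                             comment_text TEXT NOT NULL);
CREATE VIEW comments AS
    SELECT book_comment_id AS comment_id, book_id AS entity_id,
           comment_text, 'B' AS comment_type
    FROM book_comment
    UNION ALL
    SELECT author_comment_id, author_id, comment_text, 'A'
    FROM author_comment;
-- Book 1 and author 1 deliberately share the same id value.
INSERT INTO book_comment   (book_id, comment_text)   VALUES (1, 'Loved this book');
INSERT INTO author_comment (author_id, comment_text) VALUES (1, 'Met her at a signing');
""")

def comments_for(entity_id, comment_type):
    """Return comment texts for one entity; the type disambiguates overlapping ids."""
    rows = conn.execute(
        "SELECT comment_text FROM comments "
        "WHERE entity_id = ? AND comment_type = ? ORDER BY comment_id",
        (entity_id, comment_type))
    return [text for (text,) in rows]
```

Because book and author IDs come from separate identity columns, the caller has to pass the type along with the ID. For the same reason, comment_id on its own is not unique within the view; the pair (comment_type, comment_id) is.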

静谧幽蓝 2024-08-31 19:02:45

Actually, Joe Celko suggests on this blog using a custom sequence in your database and then, for the primary key of each table you want covered, specifying a default value that takes the next number from that sequence.

Here is a code sample from his blog:

CREATE SEQUENCE Service_Ticket_Seq
 AS INTEGER
 START WITH 1
 INCREMENT BY 1
 MINVALUE 1
 MAXVALUE 100
 CYCLE;

CREATE TABLE Meats
(ticket_seq INTEGER DEFAULT NEXT VALUE FOR Service_Ticket_Seq
       PRIMARY KEY,
 meat_type VARCHAR(15) NOT NULL);

CREATE TABLE Fish
(ticket_seq INTEGER DEFAULT NEXT VALUE FOR Service_Ticket_Seq
       PRIMARY KEY,
 fish_type VARCHAR(15) NOT NULL);

INSERT INTO Meats (meat_type) VALUES ('pig');
INSERT INTO Fish (fish_type) VALUES ('squid');

select * from Meats

select * from Fish

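That said, an identity-style key spanning multiple tables is possible in MS SQL: CREATE SEQUENCE is available from SQL Server 2012 onward. Note that the sample's MAXVALUE 100 with CYCLE makes the numbers wrap around after 100, so for keys that must never repeat you would leave out CYCLE and the low MAXVALUE.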

不交电费瞎发啥光 2024-08-31 19:02:45

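The short answer is: no, you can't do that (at least not in MS SQL Server as of 2008).

You could create a new table, "CommentableEntity", put your identity column in it, define foreign keys in Authors and Books that reference it as a parent table, and then use one of several tricks to ensure that a given ID value is never assigned to both tables. But this is a poor idea, because the data model you would have built implies that Authors and Books are related kinds of data, and they really aren't.

You could have a separate Comments table containing the identity column, and put a CommentId column in both Authors and Books. However, that would limit each book and each author to a single comment.

I would probably add a column such as "CommentorType" to the Comments table and put a flag in it indicating the source of the comment ("A" for author, "B" for book). Build a primary key on "CommentorId + CommentorType" and it should work well enough, and adding further types of commentors as the system expands would be trivial.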

追星践月 2024-08-31 19:02:45

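As a suggestion: try a comments table with columns like ComentId, EntityId, isBook, Comment. isBook is a boolean, so it takes up very little space. From a relational point of view, your concept is not a good one.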

潇烟暮雨 2024-08-31 19:02:45

SQL Server does not support this. You could roll your own with an id table, but that would be more work than it is worth.

I suggest your comment table look like this:

comment_id int identity
comment_type tinyint
entity_id int

comment_type specifies if the comment belongs to a book, an author, or something else you add in the future. entity_id is the id of the book, author, whatever. In this scheme, it doesn't matter if book or author ids overlap.
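A small sketch of that table, using SQLite through Python's sqlite3 so it runs standalone. The type codes BOOK and AUTHOR, the comment_text column, the index name, and the comments_for helper are assumptions added for illustration; on SQL Server comment_id would be an int identity column and comment_type a tinyint.

```python
import sqlite3

BOOK, AUTHOR = 1, 2  # hypothetical type codes stored in comment_type

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE comment (
    comment_id   INTEGER PRIMARY KEY,
    comment_type INTEGER NOT NULL,   -- which kind of thing entity_id points at
    entity_id    INTEGER NOT NULL,   -- id of the book, author, and so on
    comment_text TEXT NOT NULL
);
-- Lookups are always by (type, id), so index that pair.
CREATE INDEX ix_comment_target ON comment (comment_type, entity_id);
""")
conn.executemany(
    "INSERT INTO comment (comment_type, entity_id, comment_text) VALUES (?, ?, ?)",
    [(BOOK, 123, "About book 123"), (AUTHOR, 123, "About author 123")])

def comments_for(comment_type, entity_id):
    """Return comment texts for one target, identified by type plus id."""
    rows = conn.execute(
        "SELECT comment_text FROM comment "
        "WHERE comment_type = ? AND entity_id = ? ORDER BY comment_id",
        (comment_type, entity_id))
    return [text for (text,) in rows]
```

Book 123 and author 123 coexist without conflict because every lookup carries the type. Nothing in the database checks that entity_id points at an existing row, so that validation would have to happen elsewhere.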

Or, if you can switch to Oracle, use a sequence :)
