正确的数据结构来维持一对多关系

发布于 2024-09-17 16:36:29 字数 670 浏览 2 评论 0原文

我想在数据库中表示文档。有几种不同类型的文档。所有文档都有某些共同点，但并非所有文档都是相同的。

例如，假设我有一个基本的文档表...

TABLE docs (
    ID
    title
    content
)

现在假设我有一个可以属于用户的文档子集，并且可以具有与其关联的附加信息。我可以执行以下操作...

TABLE docs (
    ID
    userID -> users(ID)
    title
    content
    additionalInfo
)

...但是，这将导致表中出现大量空值，因为只有某些文档可以属于用户，而不是全部。因此，我创建了第二个表“ownedDocs”来扩展“docs”：

TABLE ownedDocs (
    docID -> docs(ID)
    userID -> users(ID)
    additionalInfo
)

我想知道：这是正确的方法吗？（我很担心，因为虽然所有内容都在一个表中，但我在文档和用户之间具有一对多关系。但是，通过创建一个新表ownedDocs，数据结构看起来像我在文档之间具有多对多关系和用户 - 这永远不会发生。）

提前感谢您的帮助

原文

I want to represent documents in a database. There are several different types of documents. All documents have certain things in common, but not all documents are the same.

For example, let's say I have a basic table for documents...

TABLE docs (
    ID
    title
    content
)

Now let's say I have a subset of documents that can belong to a user, and that can have additional info associated with them. I could do the following...

TABLE docs (
    ID
    userID -> users(ID)
    title
    content
    additionalInfo
)

...however this will result in a lot of null values in the table, as only some documents can belong to a user, not all. So instead I have created a second table "ownedDocs" to extend "docs":

TABLE ownedDocs (
    docID -> docs(ID)
    userID -> users(ID)
    additionalInfo
)

I am wondering: Is this the right way to do it? (I am worried because while everything is in one table, I have a one-to-many relationship between docs and users. However, by creating a new table ownedDocs, the datastructure looks like I have a many-to-many relationship between docs and users - which will never occur.)

Thanks in advance for your help

分享到QQ

分享到微博

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

需要登录才能够评论，你可以免费注册一个本站的账号。

冷弦 2024-09-24 16:36:29

数据结构看起来像是文档和用户之间存在多对多关系 - 这永远不会发生。

明白了，那么您需要在 docs 表中拥有 userid 以确保一对多关系（一个用户，可能有多个文档）。

如果文档未与特定用户关联（如 userid 列为空），我不认为附加信息列为空有什么危害。将附加信息拆分到另一个表仍然意味着一对一的关系，因此您最好使用一个包含文档、用户和附加信息的表......

回复收藏 0 原文

塔塔猫 2024-09-24 16:36:29

“通过创建一个新表ownedDocs，
数据结构看起来像我有一个
文档之间的多对多关系
和用户 - 这永远不会发生。）”

如果您将 OwnedDocs.DocId 作为主键，那么很明显 1:N 关系是不可能的。

零或一对一关系的建模是如果我们只有一个子类型，那么具有 NULL 列的单个表是一种合理的方法，但最好确保仅在适当的情况下填充子类型属性。检查约束来强制执行此规则：

check (userID is not null or AdditionalInfo is null)

或者甚至可能是此规则：

check ( (userID is not null and AdditionalInfo is not null)
        or (userID is null and AdditionalInfo is null) )

属性之间的关系不会显示在 ERD 中（除非您使用命名约定），当然，AdditionalInfo 的强制性质。在第二种情况下，拥有的文档不会很明显。

一旦我们有几个这样的子类型，单独的表的情况就变得引人注目，特别是如果子类型构成一个弧，例如文档可以是 FinancialDocument 或 MedicalDocument 或 PersonnelDocument但不超过一个类别，我曾经使用带有大量空列、视图和检查约束的单个表实现了这样的模型。太可怕了。子类型表绝对是最佳选择。

"by creating a new table ownedDocs,
the datastructure looks like I have a
many-to-many relationship between docs
and users - which will never occur.)"

If you make OwnedDocs.DocId the primary key it will be quite clear that a 1:N relationship is impossible.

The modelling of zero or one to one relationships is tricky. If we have just the one sub-type then the single table with NULL columns is a reasonable approach. However it is good practice to ensure that the sub-types attributes are only populated when appropriate. In the given example that would mean a check constraint to enforce this rule:

check (userID is not null or AdditionalInfo is null)

Or maybe even this rule:

check ( (userID is not null and AdditionalInfo is not null)
        or (userID is null and AdditionalInfo is null) )

The relationship between attributes won't show up in an ERD (unless you use a naming convention). For sure, the mandatory nature of AdditionalInfo for owned documents won't be obvious in the second case.

Once we have several such sub-types the case for separate tables becomes compelling, especially if the sub-types constitute an arc e.g. a Document can be a FinancialDocument or a MedicalDocument or a PersonnelDocument but not more than one category. I once implemented such a model using a single table with lots of null columns, views and check constraints. It was horrible. Sub-type tables are definitely the way to go.

回复收藏 0 原文

简单气质女生网名 2024-09-24 16:36:29

这取决于您想要实现的规范化级别。通常，根据您提供的描述，我会像这样构建我的数据库：

table docs (id, title, content);
table users (id, ...);
table users_docs (doc_id, user_id);
table doc_info(doc_id, additional_info);

如果我错了，有人纠正我，但这应该是第三范式。这使结构保持良好和干净，并且仅使用按预期存储数据所需的位。您可以独立存储所有元素，但在需要时相关联。

根据附加信息的性质，您需要进行一些更改。例如，附加信息是否始终对应于用户？如果文档与用户关联，是否总是会提供它？如果是这样，那么您可以将其添加到 users_docs 表中。但这至少应该向您展示标准化。

It depends on the level of normalization you want to achieve. Typically, based on the description you are providing, I would structure my DB like so:

table docs (id, title, content);
table users (id, ...);
table users_docs (doc_id, user_id);
table doc_info(doc_id, additional_info);

Someone correct me if I am wrong, but this should be 3rd normal form. This keeps the structure nice and clean and uses only the bits required to store the data as expected. You can store all elements independently but related where needed.

Depending on the nature of the additional information, you need to make some changes. For example, will the additional info ALWAYS correspond to a user? Will it always be supplied if the doc is associated to a user? If so, then you can add it to the users_docs table. But this should at least show you the normalization.

回复收藏 0 原文

~没有更多了~