Database modeling: Facebook-like messages

Posted on 2024-10-04 03:36:39 | Word count 1987 | Views 2 | Comments 0

I'm trying to mimic something similar to FB. Basically, users can post comments in various parts of a user's profile (e.g. "wall", a "photo", etc.). I think the following model will work:

===========================
wall_message
===========================
- id (PK)
- parent_id (FK)
- wall_owner_profile_id (FK, identify whose wall the message is for)
- poster_profile_id (FK)
- message
- timestamp

===========================
media_message
===========================
- id (PK)
- parent_id (FK)
- media_id (FK, identify which photo, video, etc.)
- poster_profile_id (FK)
- message
- timestamp

parent_id allows messages to be "grouped" into a related discussion. The first message's parent_id will be 0 and subsequent messages will have the PK as the parent_id value (creating a parent-child relationship).
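
A minimal sketch of the parent-child grouping, using Python's built-in sqlite3 module. One assumption differs from the description above: thread starters use NULL rather than 0 for parent_id, because once parent_id is declared as an enforced foreign key, a value of 0 is only accepted if a row with id 0 actually exists. The table is trimmed to the columns needed for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK enforcement off by default
conn.execute("""
    CREATE TABLE wall_message (
        id        INTEGER PRIMARY KEY,
        parent_id INTEGER REFERENCES wall_message(id),  -- NULL marks a thread starter (assumption)
        message   TEXT NOT NULL
    )
""")
conn.execute("INSERT INTO wall_message (id, parent_id, message) VALUES (1, NULL, 'first post')")
conn.execute("INSERT INTO wall_message (id, parent_id, message) VALUES (2, 1, 'a reply')")


def try_insert_with_parent_zero():
    """Return True if a row with parent_id = 0 is accepted, False if the FK rejects it."""
    try:
        conn.execute("INSERT INTO wall_message (id, parent_id, message) VALUES (3, 0, 'orphan')")
        return True
    except sqlite3.IntegrityError:
        return False


# All direct replies to message 1.
replies = conn.execute("SELECT id FROM wall_message WHERE parent_id = 1").fetchall()
```

If 0 has to stay as the marker, parent_id cannot be a real foreign key unless a placeholder row with id 0 is kept in the table.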

poster_profile_id identifies who posted the message.

The above two tables are very similar. Would it be a good idea to combine them, such as:

===========================
message
===========================
- id (PK)
- parent_id (FK)
- type (ENUM: "wall", "media", etc.)
- types_id (FK, see explanation below)
- poster_profile_id (FK)
- message
- timestamp

In this case, if, say, type is "wall", then types_id is equal to the first table's "wall_owner_profile_id". If, say, type is "media", then types_id is equal to the second table's media_id.

I'm a bit concerned that the second approach requires a column to explain the meaning of another column. A disadvantage to this, I suppose, is that there would be no referential integrity for types_id (unlike for "wall_owner_profile_id" and "media_id").
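
The concern can be reproduced in a few lines. The sketch below (Python with sqlite3; the profile and media tables are stripped down to their keys, and the column list of message is shortened) shows that nothing stops types_id from pointing at a media row that does not exist, because the column cannot be declared as a foreign key to two different tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE profile (profile_id INTEGER PRIMARY KEY);
    CREATE TABLE media   (media_id   INTEGER PRIMARY KEY);
    CREATE TABLE message (
        id       INTEGER PRIMARY KEY,
        type     TEXT NOT NULL CHECK (type IN ('wall', 'media')),
        types_id INTEGER NOT NULL,  -- no FK possible: the target table depends on type
        message  TEXT NOT NULL
    );
    INSERT INTO profile (profile_id) VALUES (1);
""")

# No media row with id 999 exists, yet the insert succeeds.
conn.execute("INSERT INTO message (type, types_id, message) VALUES ('media', 999, 'nice photo')")

# Count 'media' messages whose types_id matches no media row.
dangling = conn.execute("""
    SELECT COUNT(*) FROM message m
    WHERE m.type = 'media'
      AND NOT EXISTS (SELECT 1 FROM media WHERE media.media_id = m.types_id)
""").fetchone()[0]
```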

What would be the best way to tackle this problem?

EDIT 1:

Seems like this is the solution so far:

===========================
message
===========================
- message_id (PK)
- parent_message_id (FK)
- profile_id (FK, referring to who posted the message)
- message
- subject (applicable only for emails)
- timestamp

===========================
wall_message
===========================
- message_id (FK)
- profile_id (FK, referring to who received the message/owner of wall)

===========================
media_message
===========================
- message_id (FK)
- media_id (FK)

===========================
email_message
===========================
- message_id (FK)
- profile_id (FK, referring to who received the message)
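
For illustration, here is one way the layout from EDIT 1 might be declared, sketched with Python's sqlite3 module. Several details are assumptions rather than part of the listing above: message_id doubles as the primary key of each subtype table, profile and media are reduced to their keys, and subject and email_message are left out to keep the example short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE profile (profile_id INTEGER PRIMARY KEY);
    CREATE TABLE media   (media_id   INTEGER PRIMARY KEY);
    CREATE TABLE message (
        message_id        INTEGER PRIMARY KEY,
        parent_message_id INTEGER REFERENCES message(message_id),
        profile_id        INTEGER NOT NULL REFERENCES profile(profile_id),  -- who posted
        message           TEXT NOT NULL,
        timestamp         TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
    );
    CREATE TABLE wall_message (
        message_id INTEGER PRIMARY KEY REFERENCES message(message_id),
        profile_id INTEGER NOT NULL REFERENCES profile(profile_id)          -- wall owner
    );
    CREATE TABLE media_message (
        message_id INTEGER PRIMARY KEY REFERENCES message(message_id),
        media_id   INTEGER NOT NULL REFERENCES media(media_id)
    );
    INSERT INTO profile (profile_id) VALUES (1), (2);
    INSERT INTO media (media_id) VALUES (10);
    INSERT INTO message (message_id, profile_id, message) VALUES (100, 1, 'hello wall');
    INSERT INTO wall_message (message_id, profile_id) VALUES (100, 2);
    INSERT INTO message (message_id, profile_id, message) VALUES (101, 2, 'nice photo');
    INSERT INTO media_message (message_id, media_id) VALUES (101, 10);
""")

# Everything posted on the wall of profile 2.
wall_posts = conn.execute("""
    SELECT m.message_id, m.message
    FROM message m
    JOIN wall_message w ON w.message_id = m.message_id
    WHERE w.profile_id = 2
""").fetchall()
```

Every target column is now a real foreign key, at the cost of one join per message type.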

Comments (2)

命比纸薄 2024-10-11 03:36:39

First, a few responses to small points, to keep you on the straight and narrow path of Relational databases and db design.

  1. The whole idea is to place as many of the Rules as possible right in the database, in one place, and not in code. Almost everything can be done via DDL: FK constraints; CHECK constraints; and RULES (all ISO/IEC/ANSI SQL requirements). Then all the users (your app is a user) can see all the rules and understand the database better. That protects the db, no matter what client is used to execute the code. Db vendors' implementations of these constraints (that means commercial vendors, not freeware) are more reliable than code.

  2. The requirement (not convention) for inserting rows into a child table is that the parent row must exist first. That is what the FK constraint does: it ensures the parent row exists. In a many-to-many table, both parent rows must exist before the child (with two FKs, one to each parent) can be inserted.

  3. types_id is a horrible idea because you have broken design rules and removed the possibility of RI. Better to have separate columns with RI (FK constraints to each parent). (But there is an even better way.)

  4. All your Id columns, the PKs, should be renamed TableId. Each should have a private datatype of the same name. The column name is used unchanged wherever it exists as an FK. The only exception is where you have two FKs to the same parent table: there it should be RoleTableId.
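
To make point 3 concrete, here is a rough sketch of "separate columns with RI" using Python's sqlite3 module. The table and column names are borrowed from the question; the CHECK that exactly one of the two targets is set is an added assumption, not something stated above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK enforcement off by default
conn.executescript("""
    CREATE TABLE profile (profile_id INTEGER PRIMARY KEY);
    CREATE TABLE media   (media_id   INTEGER PRIMARY KEY);
    CREATE TABLE message (
        message_id            INTEGER PRIMARY KEY,
        wall_owner_profile_id INTEGER REFERENCES profile(profile_id),
        media_id              INTEGER REFERENCES media(media_id),
        message               TEXT NOT NULL,
        -- assumption: a message targets exactly one of the two parents
        CHECK ((wall_owner_profile_id IS NULL) <> (media_id IS NULL))
    );
    INSERT INTO profile (profile_id) VALUES (1);
    INSERT INTO media (media_id) VALUES (10);
""")


def accepted(wall_owner_profile_id, media_id, message):
    """Try to insert a message; return True if the database accepts it."""
    try:
        conn.execute(
            "INSERT INTO message (wall_owner_profile_id, media_id, message) VALUES (?, ?, ?)",
            (wall_owner_profile_id, media_id, message),
        )
        return True
    except sqlite3.IntegrityError:
        return False
```

With this in place, accepted(1, None, "hi") and accepted(None, 10, "hi") go through, while a missing parent such as accepted(None, 999, "hi") or a message aimed at both targets is refused by the database itself rather than by application code.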

What would be the best way to tackle this problem?

Normalise. And you will have issues that are exposed, which you need to resolve. Therefore Normalise again. And keep doing that until you have no issues to resolve.

  1. Your single Message table is already half way there. You have intuitively Normalised the two tables into one. But there are issues to resolve, so let's handle them.

    • Sebastian has provided the two many-to-many tables, so I won't repeat.
  2. Before you decide that this is final (and therefore that the two many-to-many tables are final), I suggest you Normalise Wall and Media. To me, it looks like they have many columns in common. If you Normalise them, you will get one table. Since it is a Thing that is exposed or furnished by a Person for the purpose of inviting Messages, and the type can be { Photo | Album | Mailbox | Wall }, I would call it PersonFurniture or PersonObject.

    • If that ends up as one table, then you won't need two many-to-many tables, just one.
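
As a rough illustration of where that leads, the sketch below (Python with sqlite3) folds Wall and Media into a single person_object table and uses one many-to-many table. All table and column names here are hypothetical; the linked model may well differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE person (person_id INTEGER PRIMARY KEY);
    CREATE TABLE person_object (
        person_object_id INTEGER PRIMARY KEY,
        person_id        INTEGER NOT NULL REFERENCES person(person_id),  -- the owner
        object_type      TEXT NOT NULL
            CHECK (object_type IN ('Photo', 'Album', 'Mailbox', 'Wall'))
    );
    CREATE TABLE message (
        message_id INTEGER PRIMARY KEY,
        person_id  INTEGER NOT NULL REFERENCES person(person_id),        -- the poster
        message    TEXT NOT NULL
    );
    -- The single many-to-many table that replaces message_to_wall / message_to_media.
    CREATE TABLE person_object_message (
        message_id       INTEGER NOT NULL REFERENCES message(message_id),
        person_object_id INTEGER NOT NULL REFERENCES person_object(person_object_id),
        PRIMARY KEY (message_id, person_object_id)
    );
    INSERT INTO person (person_id) VALUES (1), (2);
    INSERT INTO person_object (person_object_id, person_id, object_type)
        VALUES (1, 1, 'Wall'), (2, 1, 'Photo');
    INSERT INTO message (message_id, person_id, message)
        VALUES (100, 2, 'hi'), (101, 2, 'great shot');
    INSERT INTO person_object_message (message_id, person_object_id)
        VALUES (100, 1), (101, 2);
""")

# Every message left on anything owned by person 1, with the kind of object it was left on.
messages_for_person_1 = conn.execute("""
    SELECT po.object_type, m.message
    FROM person_object_message pom
    JOIN person_object po ON po.person_object_id = pom.person_object_id
    JOIN message m        ON m.message_id = pom.message_id
    WHERE po.person_id = 1
    ORDER BY m.message_id
""").fetchall()
```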

Responses to Comments

  1. It is easier and faster to draw the model, than to type long discussions. I have thought about most of your questions. Please check this and ask specific questions about anything you do not understand.

Link to Social Network Data Model (Page 3)

Link to IDEF1X Notation for those who are unfamiliar with the Relational Modelling Standard.

  • Choose your own table and column names.
  • Message.Subject can be set to CHAR(0), or ignored, if the message is not an Email.
  • That wall_message and email_message are identical is not a problem: I've Normalised them into one table.
  • Whether it is a wall_message, email_message or media_message is a matter of where it is "sent", right? You can easily disallow any function (e.g. grouping) for any message type via a CHECK constraint.
  • You haven't answered (2) above.
  • I think message grouping is different from media grouping: think about when a photo album has a list of messages on it.
  • Nothing is a problem. The whole idea of modelling is that paper is cheap; the whole idea of Relational dbs is to do as much as possible using constraints, checks and rules. If anything is wrong, we can change it.
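
A hedged sketch of the CHECK constraint idea, in Python with sqlite3. The message_type column and both rules (emails cannot be grouped under a parent; only emails carry a subject) are invented for the example and are not taken from the linked model.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE message (
        message_id        INTEGER PRIMARY KEY,
        parent_message_id INTEGER REFERENCES message(message_id),
        message_type      TEXT NOT NULL CHECK (message_type IN ('wall', 'media', 'email')),
        subject           TEXT NOT NULL DEFAULT '',
        message           TEXT NOT NULL,
        -- hypothetical rule: emails cannot be grouped under a parent message
        CHECK (message_type <> 'email' OR parent_message_id IS NULL),
        -- hypothetical rule: only emails carry a subject
        CHECK (message_type = 'email' OR subject = '')
    );
    INSERT INTO message (message_id, message_type, message) VALUES (1, 'wall', 'thread starter');
""")


def try_insert(parent_message_id, message_type, subject, message):
    """Try to insert a message; return True if every constraint is satisfied."""
    try:
        conn.execute(
            "INSERT INTO message (parent_message_id, message_type, subject, message)"
            " VALUES (?, ?, ?, ?)",
            (parent_message_id, message_type, subject, message),
        )
        return True
    except sqlite3.IntegrityError:
        return False
```

A wall reply such as try_insert(1, "wall", "", "reply") is accepted, whereas try_insert(1, "email", "Re: hello", "body") is turned away by the first CHECK, whatever client issued it.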

(Do you want Race (3 levels) or 2 levels in your Ethnicity question?)

星星的軌跡 2024-10-11 03:36:39

You could keep a single message table and add n:m relationship tables, e.g.

message_to_wall:
- messageID
- wallID

message_to_media:
- messageID
- mediaID

This way you keep the referential integrity and only have one message table.

Of course, this would technically allow the same message to be posted to a wall AND to a media item (photo, etc.), and you can't easily restrict that.
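
The following sketch (Python with sqlite3; the wall and media tables and the body column are placeholders) sets up the two relationship tables and shows the caveat in action: the same message can be linked to both a wall and a media item without any constraint objecting.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE message (messageID INTEGER PRIMARY KEY, body TEXT NOT NULL);
    CREATE TABLE wall    (wallID    INTEGER PRIMARY KEY);
    CREATE TABLE media   (mediaID   INTEGER PRIMARY KEY);
    CREATE TABLE message_to_wall (
        messageID INTEGER NOT NULL REFERENCES message(messageID),
        wallID    INTEGER NOT NULL REFERENCES wall(wallID),
        PRIMARY KEY (messageID, wallID)
    );
    CREATE TABLE message_to_media (
        messageID INTEGER NOT NULL REFERENCES message(messageID),
        mediaID   INTEGER NOT NULL REFERENCES media(mediaID),
        PRIMARY KEY (messageID, mediaID)
    );
    INSERT INTO message (messageID, body) VALUES (1, 'hello');
    INSERT INTO wall (wallID) VALUES (1);
    INSERT INTO media (mediaID) VALUES (1);
    -- Both inserts succeed: nothing ties a message to a single target.
    INSERT INTO message_to_wall  (messageID, wallID)  VALUES (1, 1);
    INSERT INTO message_to_media (messageID, mediaID) VALUES (1, 1);
""")

# Messages that ended up attached to both a wall and a media item.
double_posted = conn.execute("""
    SELECT w.messageID
    FROM message_to_wall w
    JOIN message_to_media m ON m.messageID = w.messageID
""").fetchall()
```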

Otherwise, if you do not really require a relational database, you could consider a NoSQL database such as CouchDB or MongoDB. You can store all the comments directly on the wall or media document. That way you avoid the JOIN queries, and the comments stay attached to the media or wall they belong to.
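
To give a feel for the document approach without tying it to a particular driver, the sketch below models a wall document as a plain Python dict and serialises it with json. The field names and the add_comment helper are made up for the example; a real CouchDB or MongoDB client would store a document of roughly this shape.

```python
import json

# One document per wall; the comments live inside it rather than in a separate table.
wall_doc = {
    "_id": "wall:1",
    "owner_profile_id": 1,
    "comments": [],
}


def add_comment(doc, poster_profile_id, message, parent_index=None):
    """Append a comment to the document and return its index within doc['comments']."""
    doc["comments"].append({
        "poster_profile_id": poster_profile_id,
        "message": message,
        "parent_index": parent_index,  # index of the comment being replied to, if any
    })
    return len(doc["comments"]) - 1


first = add_comment(wall_doc, 2, "hello")
reply = add_comment(wall_doc, 1, "hi back", parent_index=first)

# The whole wall, comments included, travels as a single JSON document.
serialized = json.dumps(wall_doc)
```

The trade-off is that integrity rules such as "the poster must exist" move out of the database and into application code.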
