棘手的关系数据库设计问题
我有一个棘手的问题,我已经困扰了几天,但找不到最佳解决方案。
这些是我的表:
- site
- site_node
- page
站点节点表包含表示层次结构的节点列表(使用嵌套集)。 每个节点必须有一个或多个关联的页面。 每个站点必须有一个关联的错误页面和一个未找到页面。
因此,页面必须要么属于节点,要么属于错误或未找到页面的站点。 我目前正在尝试的解决方案是:
- 页表上的parentType和parentId字段,其中类型可以是“node”,“site_error”或“site_notFound”,id将是站点或节点id(以与方式)。
- 页表上的nodeId字段可以为null,然后是站点表上的errorPageId和notFoundPageId字段。
选项 #1 确保每一页都属于一个且仅一个其他实体,尽管这种关系实际上无法强制执行,因为 ParentId 字段可以指向多个位置。
选项 #2 更清晰,但它基本上是说该网站“属于”两个错误页面和未找到页面,这可能是不好的做法。
有什么想法或建议吗?
谢谢,
杰克
I have a tricky problem that I've been messing about with for a few days now and cant find an optimal solution for.
These are my tables:
- site
- site_node
- page
The site node table contains a list of nodes representing a hierarchy (using nested set). Each node must have one or more associated pages. Each site must have one associated error page and one not found page.
So, a page must either belong to a node, or a site as an error or not found page. The solutions I'm currently toying with are:
- parentType and parentId fields on the page table, where type would either be "node", "site_error" or "site_notFound" and the id would be the site or node id (whichever is relevant to the type).
- nodeId field on the page table that can be null, and then errorPageId and notFoundPageId fields on the site table.
Option #1 ensures that each page belongs to one and only one other entity, although the relationship cant actually be enforced as the parentId field can point to more than one place.
Option #2 is cleaner, but it's basically saying that the site "belongs" to the two error and not found pages, and that's probably bad practice.
Any thoughts or suggestions?
Thanks,
Jack
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(5)
为错误或未找到的页面创建虚拟站点节点。 您可以根据第一个选项将它们标记为特定类型的节点。 这将使创建通用处理程序机制变得更加容易。 它还将使连接变得更简单,这将有助于提高数据库查询性能。 此外,它允许您添加更多类型的“特殊”页面(可能是登录屏幕)或使其可配置,而无需修改数据库架构。
Make dummy site nodes for the error or not found page. You can mark them as a specific type of node as per your first option. This will make it easier to make a generic handler mechanism. It will also make joins simpler which will help with database query performance. Additionally, it allows you to add more types of 'special' pages (perhaps a login screen) or make this configurable without having to modify your database schema.
正确的。 就关系理论而言,问题在于您的“
parentId
”列违反了第三条正常形式,因为它的含义根据parentType
(非键列)中的值每行而变化。如果单列每行可能包含某人的电话号码或他们的生日(具体取决于其他一些标志),那么您就不会拥有正确设计的数据库。 这是关于这个人的两个不同的事实,他们每个人都应该有自己的专栏。 同样,将 site_id 或 node_id 存储在单个列中也会出现相同的问题。
表明这是一个有缺陷的设计的另一个线索是,您无法声明外键约束来指向两个引用表中的任何一个。
我明白你为什么这么说,因为属于类 Rails 框架中的约定。 但这些都是约定俗成的。 它们不一定是外键可以建模的唯一关系。 您可以使一个实体以有一个关系精确引用另一个实体。 在这种情况下,外键会反转方向。
我想说的是,从逻辑上讲,错误页面和未找到页面属于该网站,而不是相反。 使它们成为强制性的方法是让另一个实体引用这些页面,并对这些引用应用
NOT NULL
约束。 这就是你所描述的。这满足了您的迫切需要,它是可执行的,并且是正常形式。
@NXC 建议为错误和未找到页面创建虚拟节点。 尽管这允许将这些节点存储在节点层次结构中,但它无法强制站点必须具有这些页面。 也就是说,可以在不引用这些节点的情况下存储站点。
@Tony Andrews 建议在每个页面中存储两列,
site_id
和site_node_id
,并添加 CHECK 约束以确保其中一个恰好为非 NULL。 这似乎比parent_id
/parent_type
选项更好,但它仍然没有强制要求每个站点都必须有一个错误和未找到页面。Right. In terms of relational theory, the problem is that your "
parentId
" column violates third normal form, because its meaning varies per row, based on the value inparentType
(a non-key column).You wouldn't have a properly designed database if a single column could contains someone's phone number or their birthdate, per row, depending on some other flag. Those are two different facts about the person, and they each deserve their own column. Likewise, storing both site_id or node_id in a single column would have the same problem.
Another clue that this is a flawed design is that you can't declare a foreign key constraint to point to either of two referenced tables.
I see why you're saying that, because of the belongs to conventions in Rails-like frameworks. But these are conventions; they aren't necessarily the only relationship that foreign keys can model. You can make one entity refer to exactly one other entity, in a has one relationship. In this case, the foreign key reverses direction.
I would say it's logically true that the Error page and the Not Found page belong to the site, not the other way around. And the way to make them mandatory is to have another entity reference these pages, and apply the
NOT NULL
constraint to these references. This is what you've described.This meets your immediate need, it's enforceable, and it's in Normal Form.
@NXC suggests making dummy nodes for Error and Not Found pages. Although this allows these nodes to be stored in the node hierarchy, it fails to enforce that a site must have these pages. That is, a site could be stored without references to these nodes.
@Tony Andrews suggests storing two columns in each page,
site_id
andsite_node_id
, and adding a CHECK constraint to ensure that exactly one of these is non-NULL. This seems better than theparent_id
/parent_type
option, but it still doesn't offer any enforcement that every site must have an Error and a Not Found page.另一种选择是有 2 列 site_id 和 site_node_id,如下所示:
现在,您可以使用引用完整性来确保每个页面属于一个站点或一个节点,而不是同时属于两者。
Another option is to have 2 columns site_id and site_node_id like this:
Now you can use referential integrity to ensure that every page belongs to either a site or a node, and not both.
选项 2 更有意义,并且在以后出现更多并发症时可以挽救您的大脑。 站点与错误/未找到页面的一对一关系使其非常适合外键约束。
Option 2 makes much more sense and will save your brain later down the line if and when more complications spring up. The one-to-one relationship of site to error/notfound page makes it perfect for a foreign key constraint.
选项 1 的修改。
包括两个单独的列:ParentNodeID 和 ParentSiteID。 根据情况,将这两列之一保留为 NULL。 现在您仍然可以为每个外键声明一个外键(引用)约束。
我不太明白 SiteNotFound 的情况。 在这种情况下,您可以将两个外键都保留为 NULL 吗?
您的加入和搜索将更加简单。 您还将遵守 1NF。 这并非巧合。
您的选项 1 将来自不同域的值组合到一个字段中。 这是糟糕的现场设计,IIRC 违反了 1NF。
A modification of Option 1.
Include two separate columns, ParentNodeID and ParentSiteID. Leave one of these two columns NULL, depending on the case. Now you can still declare a foreign key (references) constraint for each foreign key.
I don't really understand the SiteNotFound case. Could you leave both foreign keys NULL in this case?
Your joins and searches will be simpler. You will also be adhering to 1NF. This is not a coincidence.
Your option 1 combines values drawn from different domains in a single field. This is bad field design and IIRC violates 1NF.