有趣的树/分层数据结构问题
大学有不同的院系组织方式。有些学校去 School ->术语->部门
。其他的则介于两者之间,最长的是学校 -> 。子校区->程序->术语->分部->部门
。
School
、Term
和 Department
是学校“系树”中唯一始终存在的属性。这些类别的顺序永远不会改变,我给你的第二个例子是最长的。每走一步都是1:N的关系。
现在,我不确定如何设置表之间的关系。例如,Term
中有哪些列?其父级可以是 Program
、Sub_Campus
或 School
。具体是哪一种,要看学校的制度。我可以设想设置 Term
表来为所有这些设置外键(所有这些都默认为 NULL
),但我不确定这是规范的在这里做事的方式。
Colleges have different ways of organizing their departments. Some schools go School -> Term -> Department
. Others have steps in between, with the longest being School -> Sub_Campus -> Program -> Term -> Division -> Department
.
School
, Term
, and Department
are the only ones that always exist in a school's "tree" of departments. The order of these categories never changes, with the second example I gave you being the longest. Every step down is a 1:N relationship.
Now, I'm not sure how to set up the relationships between the tables. For example, what columns are in Term
? Its parent could be a Program
, Sub_Campus
, or School
. Which one it is depends on the school's system. I could conceive of setting up the Term
table to have foreign keys for all of those (which all would default to NULL
), but I'm not sure this is the canonical way of doing things here.
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(6)
我建议您最好使用一个通用表,称为“实体”,其中包含id字段和自引用的parent字段。
每个相关表将包含一个指向实体 ID (1:1) 的字段。在某种程度上,每个表都是实体表的子表。
I suggest you better use a general table, called e.g. Entity which would contain id field and a self-referencing parent field.
Each relevant table would contain a field pointing to Entity's id (1:1). In a way each table would be a child of the Entity table.
这是一种设计可能性:
此选项利用了您的特殊限制。基本上,您通过引入通用节点将所有层次结构概括为最长形式的层次结构。如果学校没有“子校区”,则只需为其分配一个名为“主校区”的通用子校区。例如,
学校 ->术语-> Department
可以被认为与School -> 相同。子校区 = 主校区 ->程序=主->术语->部门=主->部门
。在本例中,当学校没有名为“Main”的节点时,我们将默认分配该节点。现在,您可以为这些通用节点设置一个布尔标志属性,表明它们只是占位符,并且如果需要,该标志将允许您在中间层或 UX 中将其过滤掉。这种设计将允许您照常利用所有关系约束,并简化代码中缺失节点类型的处理。
Here's one design possibility:
This option takes advantage of your special constraints. Basically you generalize all hierarchies as that of the longest form by introducing generic nodes. If school doesn't have "sub campus" then just assign it a generic sub campus called "Main". For example,
School -> Term -> Department
can be thought of same asSchool -> Sub_Campus = Main -> Program=Main -> Term -> Division=Main -> Department
. In this case, we assign a node called "Main" as default when school doesn't have that nodes. Now you can just have a boolean flag property for these generic nodes that indicates that they are just placeholders and this flag would allow you to filter it out in middle layer or in UX if needed.This design will allow you to take advantage of all relational constraints as usual and simplify handling of missing node types in your code.
输出:
顺便说一句:我遗漏了属性。我建议它们可以通过 EAV 类型的数据模型与相关类别挂钩。
The output:
BTW: I left out the attributes. I propose they could be hooked to the relevant categories by means of a EAV type of data model.
我将首先讨论如何实现单个层次模型(仅 1:N 关系)。
让我们使用您的示例
School ->术语->部门
。这是我使用 MySQLWorkbench 生成的代码(我删除了一些内容以使其更清晰):
这是数据模型的 MySQLWorkbench 版本:
如您所见,位于层次结构顶部的
school
只有 < code>school_name 作为其密钥,而department
有一个由三部分组成的密钥,其中包括其所有家长的密钥。此解决方案的要点
UNIQUE
约束)现在是问题的第二部分。
我对问题的解释
有一个分层数据模型。但是,某些应用程序需要所有表,而其他应用程序仅使用部分表,跳过其他表。我们希望能够实现1 个单一数据模型并将其用于这两种情况。
您可以使用上面给出的解决方案,并且正如 ShitalShah 提到的那样,向任何不会使用的表添加默认值。让我们使用上面给出的模型来看一些示例数据,其中我们只想保存
School
和Department
信息(没有Term
):< strong>关键点
school
中的每个值,term
中都有一个默认值——如果您在层次结构深处有一个表,这可能会很烦人应用程序不需要,default
应该采用不同的颜色还有另一种解决方案可以将树存储在数据库中。 Bill Karwin在这里从幻灯片 49 开始讨论了这个问题,但我没有认为这就是您想要的解决方案。卡文的解决方案适用于任何大小的树木,而您的示例似乎相对静态。此外,他的解决方案也有自己的一系列问题(但不是所有问题都如此吗?)。
我希望这对您的问题有所帮助。
I'm going to start by discussing implementing a single hierarchical model (just 1:N relationships) relationally.
Let's use your example
School -> Term -> Department
.Here's code that I generated using MySQLWorkbench (I removed a few things to make it clearer):
Here is the MySQLWorkbench version of the data model:
As you can see,
school
, at the top of the hierarchy, has onlyschool_name
as its key, whereasdepartment
has a three-part key including the keys of all of its parents.Key points of this solution
UNIQUE
constraints on multi-column foreign keys)Now for the second part of your question.
My interpretation of the question
There is a hierarchical data model. However, some applications require all of the tables, whereas others utilize only some of the tables, skipping the others. We want to be able to implement 1 single data model and use it for both of these cases.
You could use the solution given above, and, as ShitalShah mentioned, add a default value to any table which would not be used. Let's see some example data, using the model given above, where we only want to save
School
andDepartment
information (noTerm
s):Key points
term
for every value inschool
-- this could be quite annoying if you had a table deep in the hierarchy that an application didn't needdefault
should be colored differentlyThere is another solution to storing trees in databases. Bill Karwin discusses it here, starting around slide 49, but I don't think this is the solution you want. Karwin's solution is for trees of any size, whereas your examples seem to be relatively static. Also, his solutions come with their own set of problems (but doesn't everything?).
I hope that helps with your question.
对于在关系数据库中拟合分层数据的一般问题,常见的解决方案是邻接列表(像您的示例一样的父子链接)和 嵌套集。正如维基百科文章中所述,Oracle 的 Tropashko 提出了一种替代方案 嵌套间隔解决方案,但它仍然相当模糊。
适合您情况的最佳选择取决于您将如何查询结构以及您正在使用哪个数据库。樱桃采摘文章:
但是:
同样,根据查询结构的方式,您可以选择 NoSQL 样式的非规范化
Department
表,并为所有可能的父项使用可为空
外键,从而完全避免递归查询。For the general problem of fitting hierarchical data in a relational database, the common solutions are adjacency lists (parent-child links like your example) and nested sets. As noted in the wikipedia article, Oracle's Tropashko propsed an alternative nested interval solution but it's still fairly obscure.
The best choice for your situation depends on how you will be querying the structure, and which DB you are using. Cherry picking the article:
However:
Again, depending on how your structure will be queried, you may choose a NoSQL style denormalized
Department
table, withnullable
foreign keys to all possible parents, avoiding recursive queries altogether.我会以一种非常灵活的方式开发它,这似乎也是最简单的:
应该只有一个表,让我们称其为category_nodes:
因此表中的每个记录都有一个唯一的 id、一个父 id 和一个名字。
现在,在前两个插入之后:在category_nodes中,其中category_node_id为0的是根节点(所有节点的父节点,无论距离多少度。第二个只是为了一个小帮助,在category_node_id = 1处设置一个未分类节点,其中也是插入表时parent_id 的默认值
现在想象根类别是学校、学期和部门,您将:
然后获取所有根类别:
现在想象一个更复杂的模式:
现在获取 School 的所有子类别,例如:
等等,由于带有parent_id 的默认值 = 1,插入“未分类”类别变得简单:
干杯。
I would develop this in a very flexible manner and what seems to mean to be the simplest as well:
There should only be one table, lets call it the category_nodes:
So each record in the table has a unique id, a parent id, and a name.
Now after the first 2 inserts: in category_nodes where the category_node_id is 0 is the root node (the parent of all nodes no matter how many degres away. The second is just for a little helper, set an uncategorized node at the category_node_id = 1 which is also the defalt value of parent_id when inserting into the table.
Now imagining the root categories are School, Term, and Dept you would:
Then to get all the root categories:
Now imagining a more complex schema:
Now to get all the subcategories of School for example:
And so on. Thanks to a default = 1 with the parent_id, inserting into the 'uncategorized' category become simple:
Cheers