如何在数据库中表示树状结构
我正在启动一个项目,并且正处于设计阶段:即,我还没有决定要使用哪个数据库框架。我将拥有创建“森林”状结构的代码。也就是很多棵树,其中每棵树都是一个标准:节点和边。代码创建这些树后,我想将它们保存在数据库中。 (然后最终将它们拉出来)
表示数据库中数据的简单方法是具有两个表的关系数据库:节点和边。也就是说,节点表将具有节点id、节点数据等。边表将是节点id到节点id的映射。
有更好的方法吗?或者考虑到我给出的(有限的)假设,这是最好的方法?如果我们添加一个假设,即树相对较小——将整个树保存为数据库中的 blob 是否更好?在这种情况下我应该使用哪种类型的数据库?请评论速度/可扩展性。
谢谢
I'm starting a project and I'm in the designing phase: I.e., I haven't decided yet on which db framework I'm going to use. I'm going to have code that creates a "forest" like structure. That is, many trees, where each tree is a standard: nodes and edges. After the code creates these trees I want to save them in the db. (and then pull them out eventually)
The naive approach to representing the data in the db is a relational db with two tables: nodes and edges. That is, the nodes table will have a node id, node data, etc.. And the edges table will be a mapping of node id to node id.
Is there a better approach? Or given the (limited) assumptions I'm giving this is the best approach? How about if we add an assumption that the trees are relatively small - is it better to save the whole tree as a blob in the db? Which type of db should I use in that case? Please comment on speed/scalability.
Thanks
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(2)
我展示了一个与您的节点类似的解决方案边表,在我对 StackOverflow 问题的回答中:将平面表解析为树的最有效/优雅的方法是什么? 我将此解决方案称为“闭包表”。
我做了一个关于在 SQL 中存储和使用树的不同方法的演示,分层模型使用 SQL 和 PHP 进行数据。我证明,使用正确的索引(取决于您需要运行的查询),闭包表设计可以具有非常好的性能,即使是在大型边集合上(在我的演示中大约有 500K 边)。
我还在我的书中介绍了该设计,SQL 反模式卷 1:避免陷阱数据库编程。
I showed a solution similar to your nodes & edges tables, in my answer to the StackOverflow question: What is the most efficient/elegant way to parse a flat table into a tree? I call this solution "Closure Table".
I did a presentation on different methods of storing and using trees in SQL, Models for Hierarchical Data with SQL and PHP. I demonstrated that with the right indexes (depending on the queries you need to run), the Closure Table design can have very good performance, even over large collections of edges (about 500K edges in my demo).
I also covered the design in my book, SQL Antipatterns Volume 1: Avoiding the Pitfalls of Database Programming.
请务必对正在树化的实体使用某种低级编码以防止循环。该实体可能是零件、主题、文件夹等。
使用实体文件和实体外部参照文件,您可以循环访问两个文件之间的两种关系(父关系和子关系)之一。
级别是树中找到的实体的级别。实体的低级代码是在任何位置的任何树中找到的实体的最低级别。检查以确保要创建子实体的低级代码小于或等于以防止循环。将实体添加为子实体后,它将至少降低一级。
Be sure to use some sort of low level-coding for the entity being treed to prevent looping. The entity might be a part, subject, folder, etc.
With an Entity file and and Entity-Xref file you can loop through one of say two relationships between the two files, a parent and a child relation.
A level is the level an entity found in a tree. A low-level-code for the entity is the lowest level an entity is found in any tree anywhere. Check to make sure the low level code of the entity you want to make a child is less than or equal to prevent a loop. after adding an entity as a child it will become at least one level lower.