关于如何在关系模型中处理层次树有什么技巧吗?

发布于 2024-08-25 06:52:09 字数 321 浏览 4 评论 0原文

我有一个树结构,可以是 n 层深,没有限制。这意味着每个节点可以有另外 n 个节点。

在不向数据库发出数千次查询的情况下检索这样的树的最佳方法是什么?

我看了一些其他的模型,比如平表模型、先序树遍历算法等等。

你们对如何实现高效的树模型有什么建议或建议吗?我真正的目标是进行一两个查询,为我吐出整棵树。

通过足够的处理,我可以在点网中显示树,但这将在客户端计算机中显示,所以,没什么大不了的。

哦,只是在这里添加一些更多信息:

环境:Oracle 或 PostgreSQL、C# 和 Winforms。

感谢您的关注

I have a tree structure that can be n-levels deep, without restriction. That means that each node can have another n nodes.

What is the best way to retrieve a tree like that without issuing thousands of queries to the database?

I looked at a few other models, like flat table model, Preorder Tree Traversal Algorithm, and so.

Do you guys have any tips or suggestions of how to implement a efficient tree model? My objective in the real end is to have one or two queries that would spit the whole tree for me.

With enough processing i can display the tree in dot net, but that would be in client machine, so, not much of a big deal.

Oh just dropping some more info here:

environment: Oracle or PostgreSQL, C# and Winforms.

Thanks for the attention

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(8

夜夜流光相皎洁 2024-09-01 06:52:09

不存在有效的树模型。

SQL 2008 - 层次结构ID。有一种用于处理层次结构的新数据类型,但随着时间的推移它会变得越来越大。

或:平常。表中的parent字段,指向父表的ID。

对于查询...

  • 您可以使用更好的 SQL(每个级别一个语句,加上使用临时表)来实现更优化。这是真正棘手的部分。
  • 我们在 CMS 中所做的就是每个节点都知道其父节点、顶部节点(对于多个图)和下一个更高的容器节点。一些节点被标记为容器 - 它们确实包含子项目,但这些子项目在逻辑上属于另一个容器(例如:网站,带有文件夹,然后带有图像等资源的文档)。

There is no efficient tree model.

SQL 2008 - hierarchyid. There is a new data type for handling hiearchies, but it gets large over time.

Or: usual. Parent field in table, pointing to the ID of the parent table.

For queries...

  • You can do that a little more optimal with better SQL (one statement PER LEVEL, plus use of a temp table). This is the really tricky part here.
  • What we did in a CMS where we had the same was that every node knew its parent, the top node (for multiple graphs) and the next higher container node. Some nodes were marked as containers - they did contain sub-items, but these belonged logically to anothe containe (like: website, with folders, then document with resources like images).
杀お生予夺 2024-09-01 06:52:09

我注意到您将数据库列为 Oracle 或 PostgreSQL。如果您能坚持使用 Oracle,则可以使用 start withconnect by 命令轻松完成您需要的操作。如果存在循环,或者如果您需要根据某些条件过滤掉分支,有很多选项也可以优雅地处理。我个人在一个生产系统中使用过它,该系统有大量的读写操作,没有任何性能问题。

快速搜索表明您可以使用 with recursive 命令对 postgreSQL 执行类似的操作(尽管在 sql 方面更复杂)。我个人没有使用过这种方法,所以我无法向您提供更多信息。

I noticed that you listed your DBs as Oracle or PostgreSQL. If you can stick to Oracle, you can use the start with and connect by commands to easily do what you need. There are plenty of options that will also gracefully handle if there are loops, or if you need to filter out a branch based on some criteria. I've personally used this in a production system that had both heavy reads and writes without any performance issues.

A quick search shows that you can do something similar (although more complex sql wise) with postgreSQL using the with recursive command. I've not personally used this method, so I can't give you any more information then that.

寄离 2024-09-01 06:52:09

嵌套集 - 更新速度慢,但查询速度快。

Nested Sets - slow for updating but fast for querying.

心不设防 2024-09-01 06:52:09

我们使用了一个非常成功的模型,其中我们为每个节点存储了一个字符串,其中包含从顶部到该节点的每个节点 ID,并用“.”分隔。检索树变得非常高效,只需对字符串进行排序。

当然,该模型对于层次结构的深度有限制。但这主要受到 id 字符串列的最大大小的限制。在使用 varchar(max) 的 SQL Server 中,限制为 2^31-1 字节,这使得层次结构相当深。

We used a model with great success where we for each node stored a string containing each node id from top to that node, separated by '.'. Retrieving a tree becomes super efficient, only sort on the string.

This model have a limitation of course as to how deep the hierarchy can go. But this is primarily limited by the max size of the id string column. In SQL Server using the varchar(max) the limit is 2^31-1 bytes which makes for pretty deep hierarchies.

风尘浪孓 2024-09-01 06:52:09

对我来说,这取决于树的大小和要运行的查询的类型。

据我所知,您有两个数据项。我的 ID 和我的父母。如果您创建了一个仅包含这两件事的简单表格,您可以询问并回答谁是我的孩子,谁是我的父母。

如果你想构建一个完整的树进行深入分析,我会查询所有数据,然后他们自己构建树。此时数据库仅充当信息存储的作用。

如果我的树很大或者创建时间超过半秒,我会在我的服务器上维护它,并通过对客户端的 ajax 调用来访问它。如果树对于所有客户端都是静态的,则此方法效果最佳。

如果树是动态的,我会坚持要求客户端等待,同时从服务器检索数据并在本地构建树。这将提高使用速度。

如果您在说“分裂树”时提供更多有关您所指内容的信息,我也许可以提供更好的意见。但总的来说我还没有在数据库中找到理想的树。然而OLAP可能有这样一个我不知道的工具。

To me this depends on the size of your tree and the type of a query you want to run.

From what I can tell you have two data items. MyId and MyParent. If you created a simply table with just those two things you could ask and answer who are my children and who are my parents.

If you wanted to build a complete tree for intensive analysis, I would query all data and them build the tree myself. The database acting only as information storage at this point.

If my tree was large or took more than a half a second to create I would maintain it on my server and make it accessible with ajax calls to the client. This works best if the tree is static for all clients.

If the tree is dynamic I would insist that the client waits while the data is retrieved from the server and the tree is built locally. This will increase usage speed.

If you provided more information on what you ment when you said "split the tree" I might be able to offer a better opinion. But in general I have not found an ideal tree in a database. However OLAP might have such a tool that I don't know about.

天暗了我发光 2024-09-01 06:52:09

假设您唯一关心的是选择而不是插入/更新/删除,这取决于需求,例如:

  • 您是否需要知道每个节点处于什么级别?
  • 您是否需要知道每个节点在渲染时是否有子节点?
  • 您需要按名字对兄弟姐妹进行排序吗?

但是,如果确实对正在完成的树进行了最小的更改,那么使用什么方案几乎并不重要,因为您可以在应用程序层中完成所有工作并缓存输出。

编辑:

当顺序很重要时,我通常会选择具体化路径 方法,并添加一个附加列 SortPath。通过这种方式,您可以按同级排序结果,这是一种非规范化,使得在 HTML 中渲染树变得非常容易,因为您可以使用单个查询完全按照接收记录的顺序写出整个树(或任何部分) 。这对于速度来说是最佳的,并且是一次对多个级别进行排序的最简单方法。

例如,

CREATE TABLE [dbo].[MatPath](
    [ID] [int] NULL,
    [Name] [varchar](50) NULL,
    [Path] [varchar](max) NULL,
    [SortPath] [varchar](max) NULL
) 

insert into MatPath (ID, Name, Path, SortPath) values (1, 'Animal', '1', 'Animal-1')
insert into MatPath (ID, Name, Path, SortPath) values (2, 'Dog', '1.2', 'Animal-1|Dog-2')
insert into MatPath (ID, Name, Path, SortPath) values (3, 'Horse', '1.3', 'Animal-1|Horse-3')
insert into MatPath (ID, Name, Path, SortPath) values (4, 'Beagle', '1.2.4', 'Animal-1|Dog-2|Beagle-4')
insert into MatPath (ID, Name, Path, SortPath) values (5, 'Abyssinian', '1.3.5', 'Animal-1|Horse-3|Abyssinian-5')
insert into MatPath (ID, Name, Path, SortPath) values (6, 'Collie', '1.2.6', 'Animal-1|Dog-2|Collie-6')

select * 
from MatPath
order by SortPath

输出:

ID     Name            Path        SortPath
------ --------------- ----------- --------------------------------
1      Animal          1           Animal-1
2      Dog             1.2         Animal-1|Dog-2
4      Beagle          1.2.4       Animal-1|Dog-2|Beagle-4
6      Collie          1.2.6       Animal-1|Dog-2|Collie-6
3      Horse           1.3         Animal-1|Horse-3
5      Abyssinian      1.3.5       Animal-1|Horse-3|Abyssinian-5

(6 row(s) affected)

您可以通过计算应用程序层中的管道(或句点)或在 SQL 中使用任何内置或自定义函数来确定每个节点的级别。可以 计算字符串的出现次数

另外,您还会注意到,在构建 SortPath 时,我将 ID 附加到 Name 中。这是为了确保具有相同名称的两个兄弟节点始终以相同的顺序返回。

Assuming your only concern is selects and not inserts/updates/deletes, this depends on needs, such as:

  • do you need to know what level each node is at?
  • do you need to know if each node has any children while rendering it?
  • do you need to sort siblings by name?

However, if there really are minimal changes to the tree being done, then it almost doesn't matter what scheme you use, as you can do all the work in the application layer and cache the output.

Edit:

When order matters, I usually go for the materialized path method, and add an additional column SortPath. This way you can get your results sorted by sibling, which is a denormalization that makes rendering the tree in HTML extremely easy, as you can write out the entire tree (or any portion) in exactly the order you receive the records using a single query. This is optimal for speed, and is the easiest way to sort more than one level at a time.

E.g.,

CREATE TABLE [dbo].[MatPath](
    [ID] [int] NULL,
    [Name] [varchar](50) NULL,
    [Path] [varchar](max) NULL,
    [SortPath] [varchar](max) NULL
) 

insert into MatPath (ID, Name, Path, SortPath) values (1, 'Animal', '1', 'Animal-1')
insert into MatPath (ID, Name, Path, SortPath) values (2, 'Dog', '1.2', 'Animal-1|Dog-2')
insert into MatPath (ID, Name, Path, SortPath) values (3, 'Horse', '1.3', 'Animal-1|Horse-3')
insert into MatPath (ID, Name, Path, SortPath) values (4, 'Beagle', '1.2.4', 'Animal-1|Dog-2|Beagle-4')
insert into MatPath (ID, Name, Path, SortPath) values (5, 'Abyssinian', '1.3.5', 'Animal-1|Horse-3|Abyssinian-5')
insert into MatPath (ID, Name, Path, SortPath) values (6, 'Collie', '1.2.6', 'Animal-1|Dog-2|Collie-6')

select * 
from MatPath
order by SortPath

Output:

ID     Name            Path        SortPath
------ --------------- ----------- --------------------------------
1      Animal          1           Animal-1
2      Dog             1.2         Animal-1|Dog-2
4      Beagle          1.2.4       Animal-1|Dog-2|Beagle-4
6      Collie          1.2.6       Animal-1|Dog-2|Collie-6
3      Horse           1.3         Animal-1|Horse-3
5      Abyssinian      1.3.5       Animal-1|Horse-3|Abyssinian-5

(6 row(s) affected)

You can determine the level of each node by counting pipes (or periods) in the application layer, or in SQL using whatever built-int or custom functions that can count occurrences of a string.

Also, you'll notice I append the ID to the Name when building SortPath. This is to ensure that two sibling nodes with the same name will always get returned in the same order.

花之痕靓丽 2024-09-01 06:52:09

您可以在构建树时对节点从 1..M 进行编号,并根据这些索引存储子引用,现在只需编写树即可。您可以在一次读取中检索所有节点。编号方案包含树信息,

You could number the nodes from 1..M as you build the tree, and store children references in terms of these indexes, Now just write the tree. You can retrieve all the nodes in a a single read. The numbering scheme contains the tree information,

一场春暖 2024-09-01 06:52:09

Joe Celko 在他的 SQL for Smarties 书中(第 29 章)对此提供了解决方案。插入、更新和删除有点复杂,但选择速度非常快。我已经使用它几年了,效果非常好。

Joe Celko has a solution for this in his SQL for Smarties book (chapter 29). Inserts, updates, and deletes are a little complicated, but selects are very fast. I have been using it for a few years now and it works very well.

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文