如何找到树中最新节点的不同根节点(保存在闭包表中)?
我尝试将消息树存储为 MySQL 上的闭包表。主要是从 Bill Karwin 的演讲分层数据模型中了解了这种方法。问题是:我想找到不同的 3 个根节点(=没有祖先的节点),它们在树中具有最新的节点。注意!即使某个根节点的子树中有 10 个最新节点,它也只会计数一次。
所有节点都有其修改时间,为简单起见,我们可以说节点 id 也代表其上次修改的时间(但我们不能在查询中使用 id 作为时间),第一个是第 1 个,最后一个是第 17 个。
1
2
8
15
16
17
7
3
4
5
6
9
12
10
14
11
13
在闭包表中,我有 3 列(祖先、后代、深度),因此树的呈现方式如下:
| ancestor | descendant | depth |
+----------+------------+-------+
| 1 | 1 | 0 |
| 1 | 2 | 1 |
| 1 | 7 | 1 |
| 1 | 8 | 2 |
| 1 | 15 | 3 |
| 1 | 16 | 3 |
| 1 | 17 | 2 |
| 2 | 2 | 0 |
| 2 | 8 | 1 |
| 2 | 15 | 2 |
| 2 | 16 | 2 |
| 2 | 17 | 1 |
| 3 | 3 | 0 |
| 3 | 4 | 1 |
| 3 | 5 | 2 |
| 3 | 6 | 1 |
| 4 | 4 | 0 |
| 4 | 5 | 1 |
| 5 | 5 | 0 |
| 6 | 6 | 0 |
| 7 | 7 | 0 |
| 8 | 8 | 0 |
| 8 | 15 | 1 |
| 8 | 16 | 1 |
| 9 | 9 | 0 |
| 9 | 12 | 1 |
| 10 | 10 | 0 |
| 10 | 14 | 1 |
| 11 | 11 | 0 |
| 11 | 13 | 1 |
| 12 | 12 | 0 |
| 13 | 13 | 0 |
| 14 | 14 | 0 |
| 15 | 15 | 0 |
| 16 | 16 | 0 |
| 17 | 17 | 0 |
我可以获得这样的最新子树:
SELECT c.ancestor, MAX(time) AS t
FROM closure c
JOIN nodes n ON (c.descendant = n.node AND c.ancestor <> n.node)
GROUP BY c.ancestor ORDER BY t desc;
但是我如何获得具有最新帖子的不同 3 个 root 节点(在本例中为 1、10 和 11)?通过一个查询这可能(并且合理)吗?
编辑:我将示例表放入pastebin
I try to store tree of messages as closure table on MySQL. Mostly learned about this method from Bill Karwin's presentation Models for hierarchical data. Problem is: i want to find distinct 3 root nodes (=nodes without ancestors), which have most recent nodes in their trees. NB! Even if some root-node has 10 newest node in it's subtree, it would count just once.
All nodes have their modification time, for simplicity lets say that node ids are representing also time of their last modification (but we can't use ids in querys as time), first is 1st and last is 17th.
1
2
8
15
16
17
7
3
4
5
6
9
12
10
14
11
13
In closure table i have 3 columns (ancestor, descendant, depth), so tree is presented like that:
| ancestor | descendant | depth |
+----------+------------+-------+
| 1 | 1 | 0 |
| 1 | 2 | 1 |
| 1 | 7 | 1 |
| 1 | 8 | 2 |
| 1 | 15 | 3 |
| 1 | 16 | 3 |
| 1 | 17 | 2 |
| 2 | 2 | 0 |
| 2 | 8 | 1 |
| 2 | 15 | 2 |
| 2 | 16 | 2 |
| 2 | 17 | 1 |
| 3 | 3 | 0 |
| 3 | 4 | 1 |
| 3 | 5 | 2 |
| 3 | 6 | 1 |
| 4 | 4 | 0 |
| 4 | 5 | 1 |
| 5 | 5 | 0 |
| 6 | 6 | 0 |
| 7 | 7 | 0 |
| 8 | 8 | 0 |
| 8 | 15 | 1 |
| 8 | 16 | 1 |
| 9 | 9 | 0 |
| 9 | 12 | 1 |
| 10 | 10 | 0 |
| 10 | 14 | 1 |
| 11 | 11 | 0 |
| 11 | 13 | 1 |
| 12 | 12 | 0 |
| 13 | 13 | 0 |
| 14 | 14 | 0 |
| 15 | 15 | 0 |
| 16 | 16 | 0 |
| 17 | 17 | 0 |
I could get newest subtrees like that:
SELECT c.ancestor, MAX(time) AS t
FROM closure c
JOIN nodes n ON (c.descendant = n.node AND c.ancestor <> n.node)
GROUP BY c.ancestor ORDER BY t desc;
But how could i get distinct 3 root nodes having newest postings (1, 10 and 11 in this case)? Is this possible (and rational) with one query?
Edit: i put sample tables to pastebin
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(6)
该线程相当古老,但我在想出一个解决方案之前偶然发现了它,该解决方案除了标准祖先 & 之外不需要额外的列。后代,你甚至不需要时间,因为OP自己指出了问题:你想要的祖先是没有其他人的后代。下面是最终的查询,下面是测试数据,您可以自己尝试一下。
测试数据脚本来加载此测试层次结构:
This thread is fairly old but I stumbled across it before coming up with a solution that does not require additional columns besides the standard ancestor & descendant, you don't even need the time, because the OP himself states the problem: you want ancestors that are descendants of no other. Below is the final query, and below that is test data to try it yourself.
Test data scripts to load up this test hierarchy:
我得到了某种解决方案。 “有点”,因为我必须在节点表中使用附加列:根。它表示节点是否是根节点。使用这个额外的位,我可以编写这样的查询:
在我看来,它的性能非常好。它也可以扩展。我生成了包含 100000 个节点的树,大约需要 1 秒才能获得结果(最大树深度为 18)。
我附加了用于内容生成的 perl 脚本(和表架构),因此也许有些人可以调整此查询以使其性能更好。
I got kind of solution. "Kind of", because i had to use additional column in nodes-table: root. It says whether node is root-node or not. Using this additional bit i can compose such query:
Seems to me, it performs pretty well. It scales too. I generated tree with 100000 node and it took around 1 sec to get results (max tree depth was 18).
I attach the perl script (and table schema) for content generation, so maybe some could tune this query to perform better.
您可以创建一个顶级元素,该元素仅供参考,其所有后代都将是根节点。
You may create one top level element, which is only for reference and all of its descendants will be the root nodes.
我尝试将其模拟到数据库中,并生成此查询来查找具有最新帖子的最后 3 个根节点。我不太确定我是否理解了您的所有请求,但如果我不理解,请告诉我,我会尽快尽力帮助您。
我的查询如下:
如您所见,同一个查询中有三个查询。
如果它对你有用,请告诉我,我明天会解释它是如何工作的。
此致,
乔迪·马斯
I tried to simulate it into a database, and I generated this query to find the last 3 root nodes having newest postings. I'm not really sure that I'd undesrstood all your requests, but if I don't, please tell me and as soon as possible I'll try to help you.
My query is the following one:
As you can see, there are three queries in the same one.
If it works to you, just tell me, and I will explain how it works tomorrow.
Best regards,
Jordi Mas
这里你有相同的代码,别名中没有引号,请检查它并告诉我它是否适合你。我在 Microsoft SQL Server 下尝试过,因为我的笔记本电脑上没有 MySQL Server,但如果它不起作用,请告诉我,我会安装并尝试它。
查询:
此查询的结果与您的数据如下:
MinAncestor: 1, 10, 11
MaxDescendant: 17, 14, 13
希望对您有所帮助。
在您对 TOP 语句发表评论后(它不适用于 MySQL),最终查询必须是这样的:
Here you have the same code without the quotes in the alias, check it and tell me if it works with you, please. I tried under Microsoft SQL Server, because I don't have MySQL Server on my laptop, but if it doesn't work, tell me and I will install and try it.
Query:
The result of this query, with your data is the following one:
MinAncestor: 1, 10, 11
MaxDescendant: 17, 14, 13
I hope it will help you.
After your comment about TOP statement (it doesn't work on MySQL), the final query had to be this one: