SQL - 嵌套集模型 - 如何找到每个分支的最深父节点(不是叶节点)

发布于 2024-10-17 00:33:19 字数 666 浏览 3 评论 0原文

给定这样一棵树:

A----B---------C----D
     |         |
     E----F    G
     |
     H

我需要找到 C 和 E(每个唯一分支的两个最深的节点;AC 和 AE)

我们的数据库使用嵌套集模型,以及邻接表作为备份以维护树结构,以防万一左右值不同步。

我可以排除所有叶节点( rgt = lft + 1 ) 和根节点( lft = 1 ) 这让我选择了 BE C。正如你可以想象的那样,这是一个大大简化的示例,我们的一些树有超过 100 个节点。如何消除数据中的这些噪音?

这是上面示例的数据(如果它存储在我们的数据库中)。

 node | parent | lft | rgt |
------+--------+-----+-----+
   A  |  NULL  |    1|   16|
   B  |    A   |    2|   15|
   E  |    B   |    3|    8|
   F  |    E   |    4|    5|
   H  |    E   |    6|    7|
   C  |    B   |    9|   14|
   D  |    C   |   10|   11|
   G  |    C   |   12|   13|  

感谢您的帮助!

Given a tree like this:

A----B---------C----D
     |         |
     E----F    G
     |
     H

I need to find C and E (the two deepest nodes of each unique branch; A-C and A-E)

Our database uses the nested set model, as well as the adjacency list as a backup to maintain tree structure in case the left and right values get out of synch.

I can exclude all leaf nodes ( rgt = lft + 1 )
and the root node ( lft = 1 )
which leaves me with B E C. As you can imagine this is a greatly simplified example, some of our trees have over 100 nodes. How can I get rid of this noise in my data?

Here's the data for the example above if it were stored in our database.

 node | parent | lft | rgt |
------+--------+-----+-----+
   A  |  NULL  |    1|   16|
   B  |    A   |    2|   15|
   E  |    B   |    3|    8|
   F  |    E   |    4|    5|
   H  |    E   |    6|    7|
   C  |    B   |    9|   14|
   D  |    C   |   10|   11|
   G  |    C   |   12|   13|  

Thanks for your help!

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(1

不念旧人 2024-10-24 00:33:19

你是对的,第一步是通过叶子节点的属性 right 等于 left + 1 来识别叶子节点。下一步是找到这些叶子节点的所有父节点:这些叶子节点的 left 小于叶子,而 right 大于叶子的那个。最后一步是排除除左值最大(即最接近叶节点)之外的所有父节点。

T-SQL 中的一个示例,其中 l 是叶节点,p1 是我们要查找的直接父节点,p2 用于清除所有非直接父节点。

首先设置一个包含示例数据的测试表:

if object_id('tempdb..#nsm') is not null
    drop table #nsm;

select v.node, v.parent, v.lft, v.rgt
into #nsm
from (
    values 
       ('A'  ,     NULL,     1,   16),
       ('B'  ,    'A'   ,    2,   15),
       ('E'  ,    'B'   ,    3,    8),
       ('F'  ,    'E'   ,    4,    5),
       ('H'  ,    'E'   ,    6,    7),
       ('C'  ,    'B'   ,    9,   14),
       ('D'  ,    'C'   ,   10,   11),
       ('G'  ,    'C'   ,   12,   13)
   ) v (node, parent, lft, rgt)

下面是检索您请求的节点 C 和 E 的查询:

select p1.node, p1.parent, p1.lft, p1.rgt
from #nsm p1
where exists (
            select *
            from #nsm l
            where l.rgt = l.lft + 1
                and p1.lft < l.lft
                and p1.rgt > l.rgt
                and not exists (
                        select *
                        from #nsm p2
                        where p2.lft < l.lft
                            and p2.rgt > l.rgt
                            and p2.lft > p1.lft
                    )
        )

You're right, the first step is to identify the leaf nodes through their property right equals left + 1. Next step is to find all parent nodes for those leaf nodes: these have a left smaller than that of the leaf and right larger than that of the leaf. And the final step is to exclude all parents but the one that has the largest left value (i.e. is closest to the leaf node).

An example in T-SQL, where l is the leaf node, p1 is the direct parent we're looking for and p2 is used to weed out all the non-direct parents.

First set up a test table with your example data in it:

if object_id('tempdb..#nsm') is not null
    drop table #nsm;

select v.node, v.parent, v.lft, v.rgt
into #nsm
from (
    values 
       ('A'  ,     NULL,     1,   16),
       ('B'  ,    'A'   ,    2,   15),
       ('E'  ,    'B'   ,    3,    8),
       ('F'  ,    'E'   ,    4,    5),
       ('H'  ,    'E'   ,    6,    7),
       ('C'  ,    'B'   ,    9,   14),
       ('D'  ,    'C'   ,   10,   11),
       ('G'  ,    'C'   ,   12,   13)
   ) v (node, parent, lft, rgt)

And here's the query that retrieves the both nodes C and E you requested:

select p1.node, p1.parent, p1.lft, p1.rgt
from #nsm p1
where exists (
            select *
            from #nsm l
            where l.rgt = l.lft + 1
                and p1.lft < l.lft
                and p1.rgt > l.rgt
                and not exists (
                        select *
                        from #nsm p2
                        where p2.lft < l.lft
                            and p2.rgt > l.rgt
                            and p2.lft > p1.lft
                    )
        )
~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文