SQL 查询：分层合并

发布于 2024-09-15 21:52:05 字数 2592 浏览 7 评论 0原文

我有一个定义层次结构的表：

Create Table [example] (
    id          Integer   Not Null Primary Key,
    parentID    Integer       Null,
    largeData1  nVarChar(max) Null,
    largeData2  nVarChar(max) Null);
    -- largeData3...n also exist

Insert Into [example] (id, parentID, largeData1, largeData2)
Select 1, null, 'blah blah blah', null          Union
Select 2,    1, null,             null          Union
Select 3,    1, 'foo bar foobar', null          Union
Select 4,    3, null,             'lorem ipsum' Union
Select 5,    4, null,             null;

此数据的层次结构图：

层次结构图

我想编写一个将返回的查询任何给定 [id] 值的单行。该行应包含该行的 [id] 和 [parentID] 信息。它还应包含 [largeData1...n] 字段。但是，如果 largeData 字段为空，则它应该向上遍历层次结构，直到遇到该字段的非空值。简而言之，它的功能应该类似于合并函数，除了跨行层次结构而不是一组列。

示例：

哪里 [id] = 1：

id:          1
parentID:    null
largeData1:  blah blah blah
largeData2:  null

哪里 [id] = 2

id:          1
parentID:    1
largeData1:  blah blah blah
largeData2:  null

哪里 [id] = 3

id:          3
parentID:    1
largeData1:  foo bar foobar
largeData2:  null

哪里 [id] = 4

id:          4
parentID:    3
largeData1:  foo bar foobar
largeData2:  lorem ipsum

其中 [id] = 5

id:          5
parentID:    4
largeData1:  foo bar foobar
largeData2:  lorem ipsum

到目前为止，我有这个：

Declare @id Integer; Set @id = 5;

With heirarchy
    (id, parentID, largeData1, largeData2, [level])
As (
    Select id, parentID, largeData1,
           largeData2, 1 As [level]
    From example
    Where id = @id

    Union All

    Select parent.id, parent.parentID,
           parent.largeData1,
           parent.largeData2,
           child.[level] + 1 As [level]
    From example As parent
    Inner Join heirarchy As child
        On parent.id = child.parentID)

Select id, parentID,
   (Select top 1 largeData1
    From heirarchy
    Where largeData1 Is Not Null
    Order By [level] Asc) As largeData1,

   (Select top 1 largeData2
    From heirarchy
    Where largeData2 Is Not Null
    Order By [level] Asc) As largeData2

From example
Where [id] = @id;

这会返回我正在寻找的结果。但是，根据查询计划，它正在为我拉回的每个大型数据字段单独遍历层次结构。

如何提高效率？

这显然是一个更复杂问题的简化版本。最终查询将返回 XML 格式的数据，因此任何涉及 FOR XML 子句的解决方案都完全没问题。

我可以为此创建一个 CLR 聚合函数，如果这样做有帮助的话。我还没有探索过那条路。

原文

I have a table that defines a hierarchy:

Create Table [example] (
    id          Integer   Not Null Primary Key,
    parentID    Integer       Null,
    largeData1  nVarChar(max) Null,
    largeData2  nVarChar(max) Null);
    -- largeData3...n also exist

Insert Into [example] (id, parentID, largeData1, largeData2)
Select 1, null, 'blah blah blah', null          Union
Select 2,    1, null,             null          Union
Select 3,    1, 'foo bar foobar', null          Union
Select 4,    3, null,             'lorem ipsum' Union
Select 5,    4, null,             null;

Hierarchy diagram for this data:

Hierarchy diagram

I want to write a query that will return a single row for any given [id] value. The row should contain that row's [id] and [parentID] information. It should also contain the [largeData1...n] fields. However, if a largeData field is null, then it should traverse up the hierarchy until a non-null value for that field is encountered. It should, in short, function like the coalesce function, except across a hierarchy of rows instead of a set of columns.

Example:

Where [id] = 1:

id:          1
parentID:    null
largeData1:  blah blah blah
largeData2:  null

Where [id] = 2

id:          1
parentID:    1
largeData1:  blah blah blah
largeData2:  null

Where [id] = 3

id:          3
parentID:    1
largeData1:  foo bar foobar
largeData2:  null

Where [id] = 4

id:          4
parentID:    3
largeData1:  foo bar foobar
largeData2:  lorem ipsum

Where [id] = 5

id:          5
parentID:    4
largeData1:  foo bar foobar
largeData2:  lorem ipsum

So far, I have this:

Declare @id Integer; Set @id = 5;

With heirarchy
    (id, parentID, largeData1, largeData2, [level])
As (
    Select id, parentID, largeData1,
           largeData2, 1 As [level]
    From example
    Where id = @id

    Union All

    Select parent.id, parent.parentID,
           parent.largeData1,
           parent.largeData2,
           child.[level] + 1 As [level]
    From example As parent
    Inner Join heirarchy As child
        On parent.id = child.parentID)

Select id, parentID,
   (Select top 1 largeData1
    From heirarchy
    Where largeData1 Is Not Null
    Order By [level] Asc) As largeData1,

   (Select top 1 largeData2
    From heirarchy
    Where largeData2 Is Not Null
    Order By [level] Asc) As largeData2

From example
Where [id] = @id;

This returns the results that I am looking for. However, according to the query plan, it is making a separate pass through the hierarchy for each largeData field that I pull back.

How can I make this more efficient?

This is obviously a simplified version of a more complex problem. The final query will return data in XML format, so any solutions involving the FOR XML clause are perfectly fine.

I can create a CLR aggregate function for this, if doing so would help. I have not yet explored that route.

分享到QQ

分享到微博

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

需要登录才能够评论，你可以免费注册一个本站的账号。

过去的过去 2024-09-22 21:52:05

我想出了这个：

DECLARE @Id  int

SET @Id = 5


;WITH cte (Id, ParentId, SaveParentId, LargeData1, LargeData2)
 as (--  The "anchor", your target Id
     select
        ex.Id
       ,ex.ParentId
       ,ex.ParentId  SaveParentId  --  Not changed throughout the CTE
       ,ex.LargeData1
       ,ex.LargeData2
      from Example ex
      where ex.Id = @Id
     union all select
                 cte.Id
                ,ex.ParentId
                ,cte.SaveParentId  --  Not changed throughout the CTE
                 --  These next are only "reset" if they are null and a not-null
                 --  value was found at this level 
                ,isnull(ex.LargeData1, cte.LargeData2)  
                ,isnull(ex.LargeData2, cte.LargeData2)
      from Example ex
       inner join cte
        on cte.ParentId = ex.Id)
 select
   Id
  ,SaveParentId     ParentId
  ,max(LargeData1)  LargeData1
  ,max(LargeData2)  LargeData2
 from cte
 group by Id, SaveParentId

基本上，从目标节点开始，沿着树向上走，如果找到空列，则将其替换为非空值。

（抱歉，我周末不做 XML。）

I came up with this:

DECLARE @Id  int

SET @Id = 5


;WITH cte (Id, ParentId, SaveParentId, LargeData1, LargeData2)
 as (--  The "anchor", your target Id
     select
        ex.Id
       ,ex.ParentId
       ,ex.ParentId  SaveParentId  --  Not changed throughout the CTE
       ,ex.LargeData1
       ,ex.LargeData2
      from Example ex
      where ex.Id = @Id
     union all select
                 cte.Id
                ,ex.ParentId
                ,cte.SaveParentId  --  Not changed throughout the CTE
                 --  These next are only "reset" if they are null and a not-null
                 --  value was found at this level 
                ,isnull(ex.LargeData1, cte.LargeData2)  
                ,isnull(ex.LargeData2, cte.LargeData2)
      from Example ex
       inner join cte
        on cte.ParentId = ex.Id)
 select
   Id
  ,SaveParentId     ParentId
  ,max(LargeData1)  LargeData1
  ,max(LargeData2)  LargeData2
 from cte
 group by Id, SaveParentId

Basically, start at your target node and walk up the tree, replacing your null columns with not-null values if and when they are found.

(Sorry, but I don't do XML on weekends.)

回复收藏 0 原文

~没有更多了~