使用nodes()方法在SQL中展平分层XML

发布于 2024-11-03 20:32:43 字数 1312 浏览 1 评论 0 原文

我有一个存储过程,它采用 XML 文档作为参数,其结构类似于以下内容:

<grandparent name="grandpa bob">
  <parent name="papa john">
    <children>
      <child name="mark" />
      <child name="cindy" />
    </children>
  </parent>
  <parent name="papa henry">
    <children>
      <child name="mary" />
    </children>
  </parent>
</grandparent>

我的要求是“展平”此数据,以便可以将其插入到临时表中并在过程中进一步操作,因此上面的内容XML 变为:

Grandparent Name Parent Name     Child Name
---------------- --------------- ---------------
grandpa bob      papa john       mark
grandpa bob      papa john       cindy
grandpa bob      papa henry      mary

目前正在使用 SQL Server XML 节点完成此操作:

SELECT
    VIRT.node.value('../../../@name','varchar(15)') 'Grandparent Name',
    VIRT.node.value('../../@name','varchar(15)') 'Parent Name',
    VIRT.node.value('@name','varchar(15)') 'Child Name'
FROM
    @xmlFamilyTree.nodes('/grandparent/parent/children/child') AS VIRT(node)

这非常有效,直到我开始在过程中抛出大量数据(即 1000 多个 节点),此时这会变成一个停止并需要 1 到 2 分钟来执行。我认为这可能是因为我从最低级别 () 开始,然后遍历每次出现的 XML 文档。将这个单个查询分成 3 个块(每个需要从中获取数据的节点一个)会提高性能吗?鉴于这些节点上都没有我可以用来连接备份的“密钥”,任何人都可以提供任何指示,我如何能够做到这一点?

I have a Stored Procedure that takes an XML document as a parameter similar in a structure to the following:

<grandparent name="grandpa bob">
  <parent name="papa john">
    <children>
      <child name="mark" />
      <child name="cindy" />
    </children>
  </parent>
  <parent name="papa henry">
    <children>
      <child name="mary" />
    </children>
  </parent>
</grandparent>

My requirement is to "flatten" this data so that it can be inserted into a temporary table and manipulated further down the procedure, so the above XML becomes:

Grandparent Name Parent Name     Child Name
---------------- --------------- ---------------
grandpa bob      papa john       mark
grandpa bob      papa john       cindy
grandpa bob      papa henry      mary

This is currently being done using SQL Server XML Nodes:

SELECT
    VIRT.node.value('../../../@name','varchar(15)') 'Grandparent Name',
    VIRT.node.value('../../@name','varchar(15)') 'Parent Name',
    VIRT.node.value('@name','varchar(15)') 'Child Name'
FROM
    @xmlFamilyTree.nodes('/grandparent/parent/children/child') AS VIRT(node)

This works great until I start throwing large amounts of data at the procedure (i.e. 1000+ child nodes), at which point this grinds to a halt and takes between 1 and 2 minutes to execute. I think this may be due to the fact that I am starting off at the lowest level (<child) and then traversing back up the XML document for each occurance. Would splitting this single query into 3 chunks (one per node that I need data from) improve performance here? Given that none of these nodes have "keys" on them that I could use to join back up with, could anyone offer any pointers how I might be able to go about doing this?

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(1

南冥有猫 2024-11-10 20:32:43

在网上进行了更多查看后,我似乎已经回答了我自己的问题:

SELECT
    grandparent.gname.value('@name', 'VARCHAR(15)'),
    parent.pname.value('@name', 'VARCHAR(15)'),
    child.cname.value('@name', 'VARCHAR(15)')
FROM
    @xmlFamilyTree.nodes('/grandparent') AS grandparent(gname)
CROSS APPLY
    grandparent.gname.nodes('*') AS parent(pname)
CROSS APPLY
    parent.pname.nodes('children/*') AS child(cname)

使用 CROSS APPLY 我可以选择顶级 grandparent 节点并使用它来选择子 < code>parent 节点等等。使用此方法,我将查询的执行时间从大约 1 分 30 秒减少到大约6 秒

但有趣的是,如果我使用“旧”OPEN XML 方法检索相同的数据,查询将在 1 秒内执行!

看来您可能必须根据传入文档的预期大小/复杂性,根据具体情况来使用这两种技术。

I seem to have answered my own question after a bit more looking around online:

SELECT
    grandparent.gname.value('@name', 'VARCHAR(15)'),
    parent.pname.value('@name', 'VARCHAR(15)'),
    child.cname.value('@name', 'VARCHAR(15)')
FROM
    @xmlFamilyTree.nodes('/grandparent') AS grandparent(gname)
CROSS APPLY
    grandparent.gname.nodes('*') AS parent(pname)
CROSS APPLY
    parent.pname.nodes('children/*') AS child(cname)

Using CROSS APPLY I can select the top-level grandparent node and use this to select the child parent nodes and so on. Using this method I have taken my query from executing in around 1 minute 30 seconds down to around 6 seconds.

Interestingly though, if I use the "old" OPEN XML method to retrieve the same data, the query executes in 1 second!

It seems like you may have to approach the use of these two techniques on a case-by-case basis depending on the expected size/complexity of the document being passed in.

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文