Sorting XML with XSLT

Posted on 2024-11-02 15:01:31

I have an XML document that describes a catalog tree. It can have any number of child nodes. Here is an example:

<Catalog name="AccessoriesCatalog">
<Category Definition="AccessoriesCategory" name="1532" id="1532">
</Category>
<Category Definition="AccessoriesCategory" name="16115" id="16115">
    <ParentCategory>1532</ParentCategory>
</Category>
<Category Definition="AccessoriesCategory" name="16116" id="16116">
    <ParentCategory>16115</ParentCategory>
</Category>
<Category Definition="AccessoriesCategory" name="16126" id="16126">
    <ParentCategory>16115</ParentCategory>
</Category>
<Category Definition="AccessoriesCategory" name="16131" id="16131">
    <ParentCategory>1532</ParentCategory>
</Category>
<Category Definition="AccessoriesCategory" name="16132" id="16132">
    <ParentCategory>16131</ParentCategory>
</Category>
<Category Definition="AccessoriesCategory" name="16136" id="16136">
    <ParentCategory>16131</ParentCategory>
</Category>
<Category Definition="AccessoriesCategory" name="16139" id="16139">
    <ParentCategory>16131</ParentCategory>
</Category>
<Category Definition="AccessoriesCategory" name="16144" id="16144">
    <ParentCategory>16131</ParentCategory>
</Category>
<Category Definition="AccessoriesCategory" name="16195" id="16195">
    <ParentCategory>16131</ParentCategory>
</Category>
</Catalog>

I need to sort it by Category name and ParentCategory. All parent categories should come first in the XML and the "leaf categories" should come last. In this sample the XML is already sorted.

The XML above looks like this when it is represented as a tree:
1532
-16115
--16116
--16126
-16131
--16132
--16136
--16139
--16144
--16195

I want it to be sorted like this:
1532
-16115
-16131
--16116
--16126
--16132
--16136
--16139
--16144
--16195

There can be several levels of child elements (this example is only a 3-level tree). I want all level 1 elements to come first in the XML, then all level 2 elements, then all level 3 elements, and so on.
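
One way to pin down the ordering asked for here is as a sort key: each category's depth in the tree first, then its name. The Python sketch below only illustrates that requirement; it is not an XSLT solution. The names `SAMPLE`, `level_order` and `depth` are made up for the example, the sample is a shuffled subset of the catalog above, and the code assumes that every ParentCategory refers to an existing id and that there are no cycles.

```python
import xml.etree.ElementTree as ET

# Shuffled subset of the catalog above (illustrative data only).
SAMPLE = """<Catalog name="AccessoriesCatalog">
<Category Definition="AccessoriesCategory" name="16116" id="16116">
    <ParentCategory>16115</ParentCategory>
</Category>
<Category Definition="AccessoriesCategory" name="16131" id="16131">
    <ParentCategory>1532</ParentCategory>
</Category>
<Category Definition="AccessoriesCategory" name="1532" id="1532">
</Category>
<Category Definition="AccessoriesCategory" name="16132" id="16132">
    <ParentCategory>16131</ParentCategory>
</Category>
<Category Definition="AccessoriesCategory" name="16115" id="16115">
    <ParentCategory>1532</ParentCategory>
</Category>
</Catalog>"""


def level_order(catalog):
    """Return the Category elements sorted by tree depth, then by name."""
    categories = catalog.findall("Category")
    # Map each id to its parent's id (None for a top-level category).
    parent_of = {c.get("id"): c.findtext("ParentCategory") for c in categories}

    def depth(cat_id):
        # Walk up the ParentCategory chain; a top-level category has depth 1.
        # Assumes every referenced parent exists and there are no cycles.
        d = 1
        while parent_of[cat_id] is not None:
            cat_id = parent_of[cat_id]
            d += 1
        return d

    # Names are compared as text, not as numbers.
    return sorted(categories, key=lambda c: (depth(c.get("id")), c.get("name")))


ordered = [c.get("name") for c in level_order(ET.fromstring(SAMPLE))]
print(ordered)  # ['1532', '16115', '16131', '16116', '16132']
```

Walking up the parent chain for every element is simple but repeats work on deep trees; caching the computed depths would avoid that.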


策马西风 2024-11-09 15:01:31

This transformation:

<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <xsl:output omit-xml-declaration="yes" indent="yes"/>
 <xsl:strip-space elements="*"/>

 <xsl:key name="kElemById" match="Category"
  use="@id"/>

 <xsl:template match="node()|@*">
  <xsl:copy>
   <xsl:apply-templates select="node()|@*"/>
  </xsl:copy>
 </xsl:template>

 <xsl:template match="/*">
  <xsl:copy>
   <xsl:copy-of select="@*"/>

   <xsl:call-template name="sortHier">
    <xsl:with-param name="pNodes" select=
    "*[ParentCategory]"/>
    <xsl:with-param name="pParents" select=
    "*[not(ParentCategory)]"/>
   </xsl:call-template>
  </xsl:copy>
 </xsl:template>

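 <xsl:template name="sortHier">
  <!-- pNodes: categories not yet output; pParents: the current parents -->
  <xsl:param name="pNodes"/>
  <xsl:param name="pParents"/>

  <!-- Output the current parents (or the remaining nodes if no parents are left), sorted by @name -->
  <xsl:apply-templates select=
   "$pParents|$pNodes[not($pParents)]">
   <xsl:sort select="@name"/>
  </xsl:apply-templates>

   <!-- Recurse only while both sets are non-empty -->
   <xsl:if test="$pNodes and $pParents">
    <!-- Categories named as a parent by the remaining nodes, minus the current parents -->
    <xsl:variable name="vNewParents"
     select="key('kElemById', $pNodes/ParentCategory)
                [not(@id=$pParents/@id)]
     "/>

    <!-- Remaining nodes that were not selected as new parents -->
    <xsl:variable name="vNewChildren"
     select="$pNodes[not(@id=$vNewParents/@id)]"/>

    <xsl:call-template name="sortHier">
     <xsl:with-param name="pNodes"
          select="$vNewChildren"/>
     <xsl:with-param name="pParents"
          select="$vNewParents"/>
    </xsl:call-template>
   </xsl:if>
 </xsl:template>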
</xsl:stylesheet>

when applied to this XML document (based on the provided one, but shuffled/unsorted):

<Catalog name="AccessoriesCatalog">
    <Category Definition="AccessoriesCategory"
    name="16144" id="16144">
        <ParentCategory>16131</ParentCategory>
    </Category>
    <Category Definition="AccessoriesCategory"
    name="16116" id="16116">
        <ParentCategory>16115</ParentCategory>
    </Category>
    <Category Definition="AccessoriesCategory"
    name="16126" id="16126">
        <ParentCategory>16115</ParentCategory>
    </Category>
    <Category Definition="AccessoriesCategory"
    name="16131" id="16131">
        <ParentCategory>1532</ParentCategory>
    </Category>
    <Category Definition="AccessoriesCategory"
    name="16132" id="16132">
        <ParentCategory>16131</ParentCategory>
    </Category>
    <Category Definition="AccessoriesCategory"
    name="16136" id="16136">
        <ParentCategory>16131</ParentCategory>
    </Category>
    <Category Definition="AccessoriesCategory"
    name="16139" id="16139">
        <ParentCategory>16131</ParentCategory>
    </Category>
    <Category Definition="AccessoriesCategory"
    name="16115" id="16115">
        <ParentCategory>1532</ParentCategory>
    </Category>
    <Category Definition="AccessoriesCategory"
    name="1532" id="1532"></Category>
    <Category Definition="AccessoriesCategory"
    name="16195" id="16195">
        <ParentCategory>16131</ParentCategory>
    </Category>
</Catalog>

produces the wanted, correct result:

<Catalog name="AccessoriesCatalog">
   <Category Definition="AccessoriesCategory" name="1532" id="1532"/>
   <Category Definition="AccessoriesCategory" name="16115" id="16115">
      <ParentCategory>1532</ParentCategory>
   </Category>
   <Category Definition="AccessoriesCategory" name="16131" id="16131">
      <ParentCategory>1532</ParentCategory>
   </Category>
   <Category Definition="AccessoriesCategory" name="16116" id="16116">
      <ParentCategory>16115</ParentCategory>
   </Category>
   <Category Definition="AccessoriesCategory" name="16126" id="16126">
      <ParentCategory>16115</ParentCategory>
   </Category>
   <Category Definition="AccessoriesCategory" name="16132" id="16132">
      <ParentCategory>16131</ParentCategory>
   </Category>
   <Category Definition="AccessoriesCategory" name="16136" id="16136">
      <ParentCategory>16131</ParentCategory>
   </Category>
   <Category Definition="AccessoriesCategory" name="16139" id="16139">
      <ParentCategory>16131</ParentCategory>
   </Category>
   <Category Definition="AccessoriesCategory" name="16144" id="16144">
      <ParentCategory>16131</ParentCategory>
   </Category>
   <Category Definition="AccessoriesCategory" name="16195" id="16195">
      <ParentCategory>16131</ParentCategory>
   </Category>
</Catalog>

Explanation:

  1. A named template is called recursively with two parameters: the "set of current parents" (or "last found parents") and the set of current (still unprocessed) nodes.

  2. Stop condition: either the "set of current parents" or the "set of current nodes", or both, are empty. Here we output (sorted by @name) whichever parameter set is still non-empty.

  3. Recursive step: the immediate children of the "current parents" become the new "current parents". The rest of the "current nodes" become the new "current nodes". On each step, copy all the current parents, or all the current nodes if there are no current parents left.
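
The three steps above can also be traced outside XSLT. The following Python sketch follows the explanation as worded: it takes the next level to be the immediate children of the current parents, and does not reproduce the stylesheet's `key()` lookup. The function name `sort_hier` only echoes the template name, and `SAMPLE`, `by_name` and the trimmed attributes are assumptions made for the illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical, shuffled sample in the same shape as the catalog above.
SAMPLE = """<Catalog name="AccessoriesCatalog">
  <Category name="16144" id="16144"><ParentCategory>16131</ParentCategory></Category>
  <Category name="16116" id="16116"><ParentCategory>16115</ParentCategory></Category>
  <Category name="16131" id="16131"><ParentCategory>1532</ParentCategory></Category>
  <Category name="16115" id="16115"><ParentCategory>1532</ParentCategory></Category>
  <Category name="1532" id="1532"></Category>
</Catalog>"""


def by_name(category):
    # Text comparison on the name attribute, not a numeric one.
    return category.get("name")


def sort_hier(nodes, parents):
    """Return the categories level by level, each level sorted by name."""
    if not parents or not nodes:
        # Stop condition: output whichever set is still non-empty.
        return sorted(parents or nodes, key=by_name)

    parent_ids = {p.get("id") for p in parents}
    # Recursive step: the immediate children of the current parents become
    # the new parents; everything else stays in the pool of nodes.
    new_parents = [n for n in nodes if n.findtext("ParentCategory") in parent_ids]
    new_nodes = [n for n in nodes if n.findtext("ParentCategory") not in parent_ids]
    return sorted(parents, key=by_name) + sort_hier(new_nodes, new_parents)


catalog = ET.fromstring(SAMPLE)
categories = catalog.findall("Category")
# Start as the stylesheet does: top-level categories are the first parents.
roots = [c for c in categories if c.find("ParentCategory") is None]
others = [c for c in categories if c.find("ParentCategory") is not None]
names = [c.get("name") for c in sort_hier(others, roots)]
print(names)  # ['1532', '16115', '16131', '16116', '16144']
```

Each call emits exactly one level, so the recursion depth equals the depth of the tree rather than the number of categories.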

Update:

In the comments, the OP claimed that the solution worked on small files,

"But when I try it on the whole xml
with more elements and more levels it
is not working. The xml I have is
about 8Mb so I can't post it here."

I asked him to provide the XML files offline, and when I got them I confirmed that this solution runs without problems on both the small and the bigger (44000 lines, 700KB) files that I was given.

With the exception of MSXML3, the performance on the bigger file wasn't too bad.

Here is the performance data for the 44000-line file, as measured on my 8-year-old (2GB RAM, 3GHz single core) PC:

MSXML3:                 91 sec.
MSXML6:                  6 sec.
AltovaXML (XMLSpy):      6 sec.
Saxon 6.5.4:             2 sec.
Saxon 9.1.05:            1.6 sec.
XslCompiledTransform:    1.3 sec.
XQSharp:                 0.8 sec.