XSLT：合并一组树层次结构

发布于 2024-07-15 22:01:56 字数 2316 浏览 7 评论 0原文

我有一个基于 Excel 在另存为“XML Spreadsheet 2003 (*.xml)”时生成的 XML 文档。

电子表格本身包含一个带有标签层次结构的标题部分：

 | A     B     C     D     E     F     G     H     I
-+-----------------------------------------------------
1| a1                                  a2
2| a11         a12         a13         a21   a22
3| a111  a112  a121  a122  a131  a132        a221  a222

此层次结构存在于工作簿中的所有工作表上，并且到处看起来或多或少相同。

Excel XML 的工作方式与普通 HTML 表格完全相同。（包含的）。我已经能够将所有内容转换为这样的树结构：

<node title="a1" col="1">
  <node title="a11" col="1">
    <node title="a111" col="1"/>
    <node title="a112" col="2"/>
  </node>
  <node title="a12" col="3">
    <node title="a121" col="3" />
    <node title="a122" col="4" />
  </node>
  <!-- and so on -->
</node>

但这里很复杂：

有多个工作表，因此每个工作表都有一棵树，
每个工作表上的层次结构可能略有不同，树不会相等（例如，工作表 2 可能有“a113”，而其他工作表则没有）
树深度没有明确限制，
但是标签在所有工作表中都是相同的，这意味着它们可以用于

分组喜欢将这些单独的树合并成一个如下所示的树：

<node title="a1">
  <col on="sheet1">1</col>
  <col on="sheet2">1</col>
  <node title="a11">
    <col on="sheet1">1</col>
    <col on="sheet2">1</col>
    <node title="a111">
      <col on="sheet1">1</col>
      <col on="sheet2">1</col>
    </node>
    <node title="a112">
      <col on="sheet1">2</col>
      <col on="sheet2">2</col>
    </node>
    <node title="a113"><!-- different here -->
      <col on="sheet2">3</col>
    </node>
  </node>
  <node title="a12">
    <col on="sheet1">3</col>
    <col on="sheet2">4</col>
    <node title="a121">
      <col on="sheet1">3</col>
      <col on="sheet2">4</col>
    </node>
    <node title="a122">
      <col on="sheet1">4</col>
      <col on="sheet2">5</col>
    </node>
  </node>
  <!-- and so on -->
</node>

理想情况下，我希望能够在我什至从 Excel XML 构建这三个结构之前进行合并（如果您让我开始）这，那就太好了）。但由于我不知道如何做到这一点，因此在构建树后进行合并（即：上述情况）就可以了。

谢谢你的时间。 :)

原文

I have an XML document based what Excel produces when saving as "XML Spreadsheet 2003 (*.xml)".

The spreadsheet itself contains a header section with a hierarchy of labels:

 | A     B     C     D     E     F     G     H     I
-+-----------------------------------------------------
1| a1                                  a2
2| a11         a12         a13         a21   a22
3| a111  a112  a121  a122  a131  a132        a221  a222

This hierarchy is present on all sheets in the workbook, and looks more or less the same everywhere.

Excel XML works exactly like ordinary HTML tables. (<row>s that contain <cell>s). I have been able to transform everything into such a tree structure:

<node title="a1" col="1">
  <node title="a11" col="1">
    <node title="a111" col="1"/>
    <node title="a112" col="2"/>
  </node>
  <node title="a12" col="3">
    <node title="a121" col="3" />
    <node title="a122" col="4" />
  </node>
  <!-- and so on -->
</node>

But here is the complication:

there is more than one worksheet, so there is a tree for each of them
the hierarchy may be slightly different on each sheet, the trees will not be equal (for example, sheet 2 may have "a113", while the others don't)
tree depth is not explicitly limited
the labels however are meant to be the same across all sheets, which means they can be used for grouping

I'd like to merge these separate trees into one that looks like this:

<node title="a1">
  <col on="sheet1">1</col>
  <col on="sheet2">1</col>
  <node title="a11">
    <col on="sheet1">1</col>
    <col on="sheet2">1</col>
    <node title="a111">
      <col on="sheet1">1</col>
      <col on="sheet2">1</col>
    </node>
    <node title="a112">
      <col on="sheet1">2</col>
      <col on="sheet2">2</col>
    </node>
    <node title="a113"><!-- different here -->
      <col on="sheet2">3</col>
    </node>
  </node>
  <node title="a12">
    <col on="sheet1">3</col>
    <col on="sheet2">4</col>
    <node title="a121">
      <col on="sheet1">3</col>
      <col on="sheet2">4</col>
    </node>
    <node title="a122">
      <col on="sheet1">4</col>
      <col on="sheet2">5</col>
    </node>
  </node>
  <!-- and so on -->
</node>

Ideally I'd like to be able to do the merge before I even build the three structure from the Excel XML (if you get me started on this, it'd be great). But since I have no idea how I would do this, a merge after the trees have been built (i.e.: the situation described above) will be fine.

Thanks for your time. :)

分享到QQ

分享到微博

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

需要登录才能够评论，你可以免费注册一个本站的账号。

唯憾梦倾城 2024-07-22 22:01:56

这里是 XSLT 1.0 中的一种可能的解决方案：

<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <xsl:output omit-xml-declaration="yes" indent="yes"/>

    <xsl:template match="/*">
      <t>
        <xsl:apply-templates
           select="node[@title='a1'][1]">
          <xsl:with-param name="pOther"
            select="node[@title='a1'][2]"/>
        </xsl:apply-templates>
      </t>
    </xsl:template>

    <xsl:template match="node">
      <xsl:param name="pOther"/>

      <node title="{@title}">
        <col on="sheet1">
          <xsl:value-of select="@col"/>
        </col>
          <xsl:choose>
            <xsl:when test="not($pOther)">
              <xsl:apply-templates mode="copy">
                <xsl:with-param name="pSheet" select="'sheet1'"/>
              </xsl:apply-templates>
            </xsl:when>
            <xsl:otherwise>
              <col on="sheet2">
                <xsl:value-of select="$pOther/@col"/>
              </col>
              <xsl:for-each select=
                "node[@title = $pOther/node/@title]">

                <xsl:apply-templates select=".">
                  <xsl:with-param name="pOther" select=
                   "$pOther/node[@title = current()/@title]"/>
                </xsl:apply-templates>
              </xsl:for-each>

              <xsl:apply-templates mode="copy" select=
                "node[not(@title = $pOther/node/@title)]">
                <xsl:with-param name="pSheet" select="'sheet1'"/>
              </xsl:apply-templates>

              <xsl:apply-templates mode="copy" select=
                "$pOther/node[not(@title = current()/node/@title)]">
                <xsl:with-param name="pSheet" select="'sheet2'"/>
              </xsl:apply-templates>
            </xsl:otherwise>
          </xsl:choose>
      </node>
    </xsl:template>

    <xsl:template match="node" mode="copy">
      <xsl:param name="pSheet"/>

      <node title="{@title}">
        <col on="{$pSheet}">
          <xsl:value-of select="@col"/>
        </col>

        <xsl:apply-templates select="node" mode="copy">
          <xsl:with-param name="pSheet" select="$pSheet"/>
        </xsl:apply-templates>
      </node>
    </xsl:template>
</xsl:stylesheet>

当上述转换应用于此 XML 文档时（两个 XML 文档在一个公共顶部节点下的串联 - 左为供读者练习:) ):

<t>
    <node title="a1" col="1">
        <node title="a11" col="1">
            <node title="a111" col="1"/>
            <node title="a112" col="2"/>
        </node>
        <node title="a12" col="3">
            <node title="a121" col="3" />
            <node title="a122" col="4" />
        </node>
        <!-- and so on -->
    </node>
    <node title="a1" col="1">
        <node title="a11" col="1">
            <node title="a111" col="1"/>
            <node title="a112" col="2"/>
            <node title="a113" col="3"/>
        </node>
        <node title="a12" col="4">
            <node title="a121" col="4" />
            <node title="a122" col="5" />
        </node>
        <!-- and so on -->
    </node>
</t>

产生了想要的结果：

<t>
    <node title="a1">
        <col on="sheet1">1</col>
        <col on="sheet2">1</col>
        <node title="a11">
            <col on="sheet1">1</col>
            <col on="sheet2">1</col>
            <node title="a111">
                <col on="sheet1">1</col>
                <col on="sheet2">1</col>
            </node>
            <node title="a112">
                <col on="sheet1">2</col>
                <col on="sheet2">2</col>
            </node>
            <node title="a113">
                <col on="sheet2">3</col>
            </node>
        </node>
        <node title="a12">
            <col on="sheet1">3</col>
            <col on="sheet2">4</col>
            <node title="a121">
                <col on="sheet1">3</col>
                <col on="sheet2">4</col>
            </node>
            <node title="a122">
                <col on="sheet1">4</col>
                <col on="sheet2">5</col>
            </node>
        </node>
    </node>
</t>

请注意以下内容：

我们假设两个顶部节点元素都有“a1 " 作为其 title 属性的值。这很容易推广。
匹配node的模板有一个名为pOther的参数，它是另一个文档中名为node的对应元素。仅当 $pOther 存在时，才会应用此模板。
当不存在名为 node 的对应元素时，将应用另一个模板，也匹配 node，但采用 copy 模式。该模板有一个名为pSheet的参数，其值为该元素所属的sheet名称（字符串）。

Here is one possible solution in XSLT 1.0:

<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <xsl:output omit-xml-declaration="yes" indent="yes"/>

    <xsl:template match="/*">
      <t>
        <xsl:apply-templates
           select="node[@title='a1'][1]">
          <xsl:with-param name="pOther"
            select="node[@title='a1'][2]"/>
        </xsl:apply-templates>
      </t>
    </xsl:template>

    <xsl:template match="node">
      <xsl:param name="pOther"/>

      <node title="{@title}">
        <col on="sheet1">
          <xsl:value-of select="@col"/>
        </col>
          <xsl:choose>
            <xsl:when test="not($pOther)">
              <xsl:apply-templates mode="copy">
                <xsl:with-param name="pSheet" select="'sheet1'"/>
              </xsl:apply-templates>
            </xsl:when>
            <xsl:otherwise>
              <col on="sheet2">
                <xsl:value-of select="$pOther/@col"/>
              </col>
              <xsl:for-each select=
                "node[@title = $pOther/node/@title]">

                <xsl:apply-templates select=".">
                  <xsl:with-param name="pOther" select=
                   "$pOther/node[@title = current()/@title]"/>
                </xsl:apply-templates>
              </xsl:for-each>

              <xsl:apply-templates mode="copy" select=
                "node[not(@title = $pOther/node/@title)]">
                <xsl:with-param name="pSheet" select="'sheet1'"/>
              </xsl:apply-templates>

              <xsl:apply-templates mode="copy" select=
                "$pOther/node[not(@title = current()/node/@title)]">
                <xsl:with-param name="pSheet" select="'sheet2'"/>
              </xsl:apply-templates>
            </xsl:otherwise>
          </xsl:choose>
      </node>
    </xsl:template>

    <xsl:template match="node" mode="copy">
      <xsl:param name="pSheet"/>

      <node title="{@title}">
        <col on="{$pSheet}">
          <xsl:value-of select="@col"/>
        </col>

        <xsl:apply-templates select="node" mode="copy">
          <xsl:with-param name="pSheet" select="$pSheet"/>
        </xsl:apply-templates>
      </node>
    </xsl:template>
</xsl:stylesheet>

When the above transformation is applied on this XML document (the concatenation of the two XML documents under a common top node -- left as an exercise for the reader :) ):

<t>
    <node title="a1" col="1">
        <node title="a11" col="1">
            <node title="a111" col="1"/>
            <node title="a112" col="2"/>
        </node>
        <node title="a12" col="3">
            <node title="a121" col="3" />
            <node title="a122" col="4" />
        </node>
        <!-- and so on -->
    </node>
    <node title="a1" col="1">
        <node title="a11" col="1">
            <node title="a111" col="1"/>
            <node title="a112" col="2"/>
            <node title="a113" col="3"/>
        </node>
        <node title="a12" col="4">
            <node title="a121" col="4" />
            <node title="a122" col="5" />
        </node>
        <!-- and so on -->
    </node>
</t>

The wanted result is produced:

<t>
    <node title="a1">
        <col on="sheet1">1</col>
        <col on="sheet2">1</col>
        <node title="a11">
            <col on="sheet1">1</col>
            <col on="sheet2">1</col>
            <node title="a111">
                <col on="sheet1">1</col>
                <col on="sheet2">1</col>
            </node>
            <node title="a112">
                <col on="sheet1">2</col>
                <col on="sheet2">2</col>
            </node>
            <node title="a113">
                <col on="sheet2">3</col>
            </node>
        </node>
        <node title="a12">
            <col on="sheet1">3</col>
            <col on="sheet2">4</col>
            <node title="a121">
                <col on="sheet1">3</col>
                <col on="sheet2">4</col>
            </node>
            <node title="a122">
                <col on="sheet1">4</col>
                <col on="sheet2">5</col>
            </node>
        </node>
    </node>
</t>

Do note the following:

We suppose that both top node elements have "a1" as the value of their title attribute. This can easily be generalized.
The template matching node has a parameter named pOther, which is the corresponding element named node from the other document. This template is applied - to only if $pOther exists.
When no corresponding element named node exists, another template, also matching node, but in mode copy is applied. This template has a parameter named pSheet, the value of which is the sheet name (string) this element belongs to.