使用 XSLT/XPath 查找有向无环图 (DAG) 最小元素(顶点)?

发布于 2024-07-19 04:51:08 字数 5304 浏览 6 评论 0原文

我有一个 XML 文件,它编码 有向无环图 (DAG) 表示部分订单。 此类图对于指定依赖关系和查找关键路径等事情很有用。 出于好奇,我当前的应用程序是为 构建系统指定组件依赖项,因此顶点是组件,边指定编译时依赖项。 这是一个简单的示例:

<?xml version="1.0"?>
<dag>
    <vertex name="A">
        <directed-edge-to vertex="C"/>
    </vertex>
    <vertex name="B">
        <directed-edge-to vertex="C"/>
        <directed-edge-to vertex="D"/>
    </vertex>
    <vertex name="C">
        <directed-edge-to vertex="E"/>
    </vertex>
    <vertex name="D">
        <directed-edge-to vertex="E"/>
    </vertex>
    <vertex name="E">
        <directed-edge-to vertex="G"/>
    </vertex>
    <vertex name="F">
        <directed-edge-to vertex="G"/>
    </vertex>
    <vertex name="G"/>
</dag>

这个 DAG 可以这样绘制:


(来源:iparelan.com

我想申请 < a href="http://www.w3.org/TR/xslt" rel="nofollow noreferrer">XSLT 生成另一个 XML 的样式表 仅包含与最小元素相对应的顶点的文档部分订单。 即那些没有传入边的顶点。 示例图的最小顶点集是{A, B, F}。 对于我的构建依赖项应用程序,找到这个集合很有价值,因为我知道如果我构建这个集合的成员,那么我的项目中的所有内容都将被构建。

这是我当前的样式表解决方案(我使用 Apache Ant 的 xslt 任务在 Java 上的 Xalan 上运行该解决方案)。 一个关键的观察结果是,最小顶点不会在任何 directed-edge-to 元素中引用:

<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:xalan="http://xml.apache.org/xslt"
                exclude-result-prefixes="xalan">
    <xsl:output method="xml" indent="yes" xalan:indent-amount="4"/>

    <xsl:template match="dag">
        <minimal-vertices>
            <xsl:for-each select="//vertex">
                <xsl:if test="not(//vertex/directed-edge-to[@vertex=current()/@name])">
                    <minimal-vertex name="{@name}"/>
                </xsl:if>
            </xsl:for-each>
        </minimal-vertices>
    </xsl:template>
</xsl:stylesheet>

应用此样式表会产生以下输出(我认为这是正确的):

<?xml version="1.0" encoding="UTF-8"?>
<minimal-vertices>
    <minimal-vertex name="A"/>
    <minimal-vertex name="B"/>
    <minimal-vertex name="F"/>
</minimal-vertices>

问题是,我对此解决方案并不完全满意。 我想知道是否有一种方法可以将 for-eachselecttest 结合起来if 使用 XPath 语法。

我想写一些类似的东西:

<xsl:for-each select="//vertex[not(//vertex/directed-edge-to[@vertex=current()/@name])]">

但这并不能达到我想要的效果,因为 current() 函数不引用外部选择的节点//顶点表达式。

到目前为止,我的解决方案使用 XPath 1.0XSLT 1.0 语法,尽管我对 XPath 2.0XSLT 2.0 语法。

如果您愿意,这里是 Ant 构建脚本:

<?xml version="1.0"?>
<project name="minimal-dag" default="default">
    <target name="default">
        <xslt in="dag.xml" out="minimal-vertices.xml" style="find-minimal-vertices.xsl"/>
    </target>
    <target name="dot">
        <xslt in="dag.xml" out="dag.dot" style="xml-to-dot.xsl"/>
    </target>
</project>

dot 目标生成 Graphviz 用于渲染图形的语言代码。 这是xml-to-dot.xsl

<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:xalan="http://xml.apache.org/xslt"
                exclude-result-prefixes="xalan">
    <xsl:output method="text"/>

    <xsl:template match="dag">
        digraph {
        rankdir="BT";
        node [style="filled", fillcolor="cyan", fontname="Helvetica"];
        <xsl:apply-templates select="//directed-edge-to"/>
        }
    </xsl:template>

    <xsl:template match="directed-edge-to">
        <xsl:value-of select="concat(ancestor::vertex/@name, '->', @vertex, ';')"/>
    </xsl:template>
</xsl:stylesheet>

I have an XML file that encodes a
directed acyclic graph
(DAG)
that represents a partial order. Such graphs are useful for things like specifying dependencies and finding critical paths. For the curious, my current application is to specify component dependencies for a build system, so vertices are components and edges specify compile-time dependencies. Here is a simple example:

<?xml version="1.0"?>
<dag>
    <vertex name="A">
        <directed-edge-to vertex="C"/>
    </vertex>
    <vertex name="B">
        <directed-edge-to vertex="C"/>
        <directed-edge-to vertex="D"/>
    </vertex>
    <vertex name="C">
        <directed-edge-to vertex="E"/>
    </vertex>
    <vertex name="D">
        <directed-edge-to vertex="E"/>
    </vertex>
    <vertex name="E">
        <directed-edge-to vertex="G"/>
    </vertex>
    <vertex name="F">
        <directed-edge-to vertex="G"/>
    </vertex>
    <vertex name="G"/>
</dag>

This DAG may be drawn like this:


(source: iparelan.com)

I'd like to apply an XSLT stylesheet that produces another XML
document that contains only the vertices that correspond to minimal elements of the partial order. That is, those vertices that have no incoming edges. The set of minimal vertices for the example graph is {A, B, F}. For my build dependency application, finding this set is valuable because I know that if I build the members of this set, then everything in my project will be built.

Here is my current stylesheet solution (I'm running this with Xalan on Java using Apache Ant's xslt task). A key observation is that a minimal vertex will not be referenced in any directed-edge-to element:

<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:xalan="http://xml.apache.org/xslt"
                exclude-result-prefixes="xalan">
    <xsl:output method="xml" indent="yes" xalan:indent-amount="4"/>

    <xsl:template match="dag">
        <minimal-vertices>
            <xsl:for-each select="//vertex">
                <xsl:if test="not(//vertex/directed-edge-to[@vertex=current()/@name])">
                    <minimal-vertex name="{@name}"/>
                </xsl:if>
            </xsl:for-each>
        </minimal-vertices>
    </xsl:template>
</xsl:stylesheet>

Applying this stylesheet produces the following output (which I believe is correct):

<?xml version="1.0" encoding="UTF-8"?>
<minimal-vertices>
    <minimal-vertex name="A"/>
    <minimal-vertex name="B"/>
    <minimal-vertex name="F"/>
</minimal-vertices>

The thing is, I'm not completely satisfied with this solution. I'm wondering if there is a way to combine the select of the for-each and the test of the if with XPath syntax.

I want to write something like:

<xsl:for-each select="//vertex[not(//vertex/directed-edge-to[@vertex=current()/@name])]">

But that does not do what I want because the current() function does not reference the nodes selected by the outer //vertex expression.

Thusfar, my solution uses XPath 1.0 and XSLT 1.0 syntax, though I'm open to XPath 2.0 and XSLT 2.0 syntax as well.

Here's the Ant build script if you like:

<?xml version="1.0"?>
<project name="minimal-dag" default="default">
    <target name="default">
        <xslt in="dag.xml" out="minimal-vertices.xml" style="find-minimal-vertices.xsl"/>
    </target>
    <target name="dot">
        <xslt in="dag.xml" out="dag.dot" style="xml-to-dot.xsl"/>
    </target>
</project>

The dot target generates Graphviz Dot language code for rendering the graph. Here is xml-to-dot.xsl:

<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:xalan="http://xml.apache.org/xslt"
                exclude-result-prefixes="xalan">
    <xsl:output method="text"/>

    <xsl:template match="dag">
        digraph {
        rankdir="BT";
        node [style="filled", fillcolor="cyan", fontname="Helvetica"];
        <xsl:apply-templates select="//directed-edge-to"/>
        }
    </xsl:template>

    <xsl:template match="directed-edge-to">
        <xsl:value-of select="concat(ancestor::vertex/@name, '->', @vertex, ';')"/>
    </xsl:template>
</xsl:stylesheet>

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(2

惟欲睡 2024-07-26 04:51:08

您可以在 = 运算符上利用 XPath 的隐式存在量化:

<xsl:for-each select="//vertex[not(@name = //vertex/directed-edge-to/@vertex)]">

当您使用六个比较运算符(=!=<<=>>=)来比较节点集,如果节点集中的任何节点满足条件,则表达式将返回 true。 将一个节点集与另一个节点集进行比较时,如果第一个节点集中的任何节点与第二个节点集中的任何节点进行比较时满足条件,则表达式返回 true。 XPath 2.0 引入了六个不执行这种存在量化的新运算符(eqneltlegtge)。 但就您而言,您需要使用“=”来获得存在量化。

当然请注意,您仍然需要像以前一样使用 not() 函数。 大多数时候,最好避免使用 != 运算符。 如果您在这里使用它而不是 not(),那么如果有任何 @vertex 属性不等于 @name ,它将返回 true code> value,这不是你的本意。 (如果任一节点集为空,则它将返回 false,因为与空节点集的比较始终返回 false。)

如果您想使用 eq 代替,那么您必须这样做就像您所做的那样:将条件从迭代中分离出来,以便您可以绑定current()。 但在 XPath 2.0 中,您可以在表达式中执行此操作:

<xsl:for-each select="for $v in //vertex
                      return $v[not(//directed-edge-to[@vertex eq $v/@name])]">

当您的条件不是简单的相等比较(因此无法使用“=”进行存在量化)时,这非常有用。 例如:starts-with(@vertex, $v/@name)

XPath 2.0 还具有执行存在量化的显式方法。 我们可以这样写:

<xsl:for-each select="//vertex[not(some $e in //directed-edge-to
                                   satisfies @name eq $e/@vertex)]">

除了“some”语法之外,XPath 2.0 还提供了相应的“everyfor 表达式。 >”用于执行通用量化的语法。

您还可以使用模板规则,而不是使用 for-each,它更加模块化(并且功能更强大):

<xsl:stylesheet version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:template match="/">
    <minimal-vertices>
      <xsl:apply-templates/>
    </minimal-vertices>
  </xsl:template>

  <!-- Copy vertex elements that have no arrows pointing to them -->
  <xsl:template match="vertex[not(@name = //directed-edge-to/@vertex)]">
    <minimal-vertex name="{@name}"/>
  </xsl:template>

</xsl:stylesheet>

同样,在这种情况下,我们依赖于 =< 的存在量化/代码>。

XSLT 1.0 禁止在模式(即 match 属性)中使用 current() 函数,但 XSLT 2.0 允许。 在这种情况下,current() 指的是当前正在匹配的节点。 因此,在 XSLT 2.0 中,我们也可以这样编写(无需使用 for 表达式):

<xsl:template match="vertex[not(//directed-edge-to[@vertex eq current()/@name])]">

请注意,此模式本质上与您尝试在 for-each< 中使用的表达式相同/code>,但是虽然它在 for-each 中没有执行您想要的操作,但它确实在模式中执行您想要的操作(因为 current( ) 绑定到的是不同的)。

最后,我将添加另一种变体,它在某种程度上简化了逻辑(删除 not())。 这也可以追溯到使用 XSLT 1.0:

<xsl:stylesheet version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:template match="/">
    <minimal-vertices>
      <xsl:apply-templates/>
    </minimal-vertices>
  </xsl:template>

  <!-- By default, copy vertex elements -->
  <xsl:template match="vertex">
    <minimal-vertex name="{@name}"/>
  </xsl:template>

  <!-- But strip out vertices with incoming arrows -->
  <xsl:template match="vertex[@name = //directed-edge-to/@vertex]"/>

</xsl:stylesheet>

如果您不喜欢输出空格,请为文本节点添加一个空规则,这样它们就会被删除(覆盖文本节点的默认规则,即复制它们) :

<xsl:template match="text()"/>

或者您可以在应用模板的节点上更有选择性:

<xsl:apply-templates select="/dag/vertex"/>

您采用的方法部分取决于品味,部分取决于样式表和预期数据的更广泛上下文(输入结构可能变化多少等) 。

我知道我远远超出了你的要求,但我希望你至少觉得这很有趣。 :-)

You can take advantage of XPath's implicit existential quantification on the = operator:

<xsl:for-each select="//vertex[not(@name = //vertex/directed-edge-to/@vertex)]">

When you use any of the six comparison operators (=, !=, <, <=, >, and >=) to compare a node-set, the expression will return true if any node in the node-set satisfies the condition. When comparing one node-set with another, the expression returns true if any node in the first node-set satisfies the condition when compared with any node in the second node-set. XPath 2.0 introduces six new operators that don't perform this existential quantification (eq, ne, lt, le, gt, and ge). But in your case, you'll want to use "=" to get that existential quantification.

Note of course, that you'll still want to use the not() function as you were doing. Most of the time, it's good to avoid the != operator. If you used it here instead of not(), then it would return true if there are any @vertex attributes that are not equal to the @name value, which is not your intention. (And if either node-set is empty, then it would return false, as comparisons with empty node-sets always return false.)

If you want to use eq instead, then you'd have to do something like you did: separate out the conditional from the iteration so you could bind current(). But in XPath 2.0, you can do this within an expression:

<xsl:for-each select="for $v in //vertex
                      return $v[not(//directed-edge-to[@vertex eq $v/@name])]">

This is useful for when your condition isn't a simple equality comparison (and thus can't be existentially quantified using "="). For example: starts-with(@vertex, $v/@name).

XPath 2.0 also has an explicit way of performing existential quantification. Instead of the for expression above, we could have written this:

<xsl:for-each select="//vertex[not(some $e in //directed-edge-to
                                   satisfies @name eq $e/@vertex)]">

In addition to the "some" syntax, XPath 2.0 also supplies a corresponding "every" syntax for performing universal quantification.

Rather than using for-each, you could also use template rules, which are more modular (and powerful):

<xsl:stylesheet version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:template match="/">
    <minimal-vertices>
      <xsl:apply-templates/>
    </minimal-vertices>
  </xsl:template>

  <!-- Copy vertex elements that have no arrows pointing to them -->
  <xsl:template match="vertex[not(@name = //directed-edge-to/@vertex)]">
    <minimal-vertex name="{@name}"/>
  </xsl:template>

</xsl:stylesheet>

Again, in this case, we're relying on the existential quantification of =.

XSLT 1.0 prohibits use of the current() function in patterns, i.e., in the match attribute, but XSLT 2.0 allows it. In that case, current() refers to the node currently being matched. So in XSLT 2.0, we could also write this (without having to use a for expression):

<xsl:template match="vertex[not(//directed-edge-to[@vertex eq current()/@name])]">

Note that this pattern is essentially the same as the expression you tried to use in for-each, but whereas it doesn't do what you want in for-each, it does do what you want in the pattern (because what current() binds to is different).

Finally, I'll add one more variation that in some ways simplifies the logic (removing not()). This also goes back to using XSLT 1.0:

<xsl:stylesheet version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:template match="/">
    <minimal-vertices>
      <xsl:apply-templates/>
    </minimal-vertices>
  </xsl:template>

  <!-- By default, copy vertex elements -->
  <xsl:template match="vertex">
    <minimal-vertex name="{@name}"/>
  </xsl:template>

  <!-- But strip out vertices with incoming arrows -->
  <xsl:template match="vertex[@name = //directed-edge-to/@vertex]"/>

</xsl:stylesheet>

If you don't like the whitespace being output, add an empty rule for text nodes, so they'll get stripped out (overriding the default rule for text nodes, which is to copy them):

<xsl:template match="text()"/>

Or you could just be more selective in what nodes you apply templates to:

<xsl:apply-templates select="/dag/vertex"/>

Which approach you take is partially dependent on taste, partially dependent on the wider context of your stylesheet and expected data (how much the input structure might vary, etc.).

I know I went way beyond what you were asking for, but I hope you at least found this interesting. :-)

画▽骨i 2024-07-26 04:51:08

这样的 XPath 1.0 表达式是

       /*/vertex[not(@name = /*/vertex /directed-edge-to/@vertex)]

然后将其放入 XSLT 样式表中

<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <xsl:output omit-xml-declaration="yes" indent="yes"/>

    <xsl:template match="/">
      <minimal-vertices>
          <xsl:for-each select=
          "/*/vertex[not(@name = /*/vertex/directed-edge-to/@vertex)]"
          >
           <minimal-vertex name="{@name}"/>
          </xsl:for-each>
      </minimal-vertices>
    </xsl:template>
</xsl:stylesheet>

当此样式表应用于最初提供的XML 文档

<dag>
    <vertex name="A">
        <directed-edge-to vertex="C"/>
    </vertex>
    <vertex name="B">
        <directed-edge-to vertex="C"/>
        <directed-edge-to vertex="D"/>
    </vertex>
    <vertex name="C">
        <directed-edge-to vertex="E"/>
    </vertex>
    <vertex name="D">
        <directed-edge-to vertex="E"/>
    </vertex>
    <vertex name="E">
        <directed-edge-to vertex="G"/>
    </vertex>
    <vertex name="F">
        <directed-edge-to vertex="G"/>
    </vertex>
    <vertex name="G"/>
</dag>

生成了想要的结果

<minimal-vertices>
  <minimal-vertex name="A" />
  <minimal-vertex name="B" />
  <minimal-vertex name="F" />
</minimal-vertices>

请注意XSLT 中提供了遍历完整(可能是循环)图的解决方案 此处

One such XPath 1.0 expression is:

        /*/vertex[not(@name = /*/vertex/directed-edge-to/@vertex)]

Then just put it into an XSLT stylesheet like that:

<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <xsl:output omit-xml-declaration="yes" indent="yes"/>

    <xsl:template match="/">
      <minimal-vertices>
          <xsl:for-each select=
          "/*/vertex[not(@name = /*/vertex/directed-edge-to/@vertex)]"
          >
           <minimal-vertex name="{@name}"/>
          </xsl:for-each>
      </minimal-vertices>
    </xsl:template>
</xsl:stylesheet>

When this stylesheet is applied on the originally-provided XML document:

<dag>
    <vertex name="A">
        <directed-edge-to vertex="C"/>
    </vertex>
    <vertex name="B">
        <directed-edge-to vertex="C"/>
        <directed-edge-to vertex="D"/>
    </vertex>
    <vertex name="C">
        <directed-edge-to vertex="E"/>
    </vertex>
    <vertex name="D">
        <directed-edge-to vertex="E"/>
    </vertex>
    <vertex name="E">
        <directed-edge-to vertex="G"/>
    </vertex>
    <vertex name="F">
        <directed-edge-to vertex="G"/>
    </vertex>
    <vertex name="G"/>
</dag>

The wanted result is produced:

<minimal-vertices>
  <minimal-vertex name="A" />
  <minimal-vertex name="B" />
  <minimal-vertex name="F" />
</minimal-vertices>

Do note: A solution for traversing full (maybe cyclic) graphs is available in XSLT here.

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文