Finding the minimal elements (vertices) of a directed acyclic graph (DAG) with XSLT/XPath?
I have an XML file that encodes a directed acyclic graph (DAG) that represents a partial order. Such graphs are useful for things like specifying dependencies and finding critical paths. For the curious, my current application is to specify component dependencies for a build system, so vertices are components and edges specify compile-time dependencies. Here is a simple example:
<?xml version="1.0"?>
<dag>
    <vertex name="A">
        <directed-edge-to vertex="C"/>
    </vertex>
    <vertex name="B">
        <directed-edge-to vertex="C"/>
        <directed-edge-to vertex="D"/>
    </vertex>
    <vertex name="C">
        <directed-edge-to vertex="E"/>
    </vertex>
    <vertex name="D">
        <directed-edge-to vertex="E"/>
    </vertex>
    <vertex name="E">
        <directed-edge-to vertex="G"/>
    </vertex>
    <vertex name="F">
        <directed-edge-to vertex="G"/>
    </vertex>
    <vertex name="G"/>
</dag>
This DAG may be drawn like this:
(source: iparelan.com)
I'd like to apply an XSLT stylesheet that produces another XML document that contains only the vertices that correspond to minimal elements of the partial order. That is, those vertices that have no incoming edges. The set of minimal vertices for the example graph is {A, B, F}. For my build dependency application, finding this set is valuable because I know that if I build the members of this set, then everything in my project will be built.
Here is my current stylesheet solution (I'm running this with Xalan on Java using Apache Ant's xslt task). A key observation is that a minimal vertex will not be referenced in any directed-edge-to element:
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:xalan="http://xml.apache.org/xslt"
                exclude-result-prefixes="xalan">
    <xsl:output method="xml" indent="yes" xalan:indent-amount="4"/>
    <xsl:template match="dag">
        <minimal-vertices>
            <xsl:for-each select="//vertex">
                <xsl:if test="not(//vertex/directed-edge-to[@vertex=current()/@name])">
                    <minimal-vertex name="{@name}"/>
                </xsl:if>
            </xsl:for-each>
        </minimal-vertices>
    </xsl:template>
</xsl:stylesheet>
Applying this stylesheet produces the following output (which I believe is correct):
<?xml version="1.0" encoding="UTF-8"?>
<minimal-vertices>
    <minimal-vertex name="A"/>
    <minimal-vertex name="B"/>
    <minimal-vertex name="F"/>
</minimal-vertices>
The thing is, I'm not completely satisfied with this solution. I'm wondering if there is a way to combine the select of the for-each and the test of the if with XPath syntax.
I want to write something like:
<xsl:for-each select="//vertex[not(//vertex/directed-edge-to[@vertex=current()/@name])]">
But that does not do what I want because the current() function does not reference the nodes selected by the outer //vertex expression.
Thus far, my solution uses XPath 1.0 and XSLT 1.0 syntax, though I'm open to XPath 2.0 and XSLT 2.0 syntax as well.
Here's the Ant build script if you like:
<?xml version="1.0"?>
<project name="minimal-dag" default="default">
    <target name="default">
        <xslt in="dag.xml" out="minimal-vertices.xml" style="find-minimal-vertices.xsl"/>
    </target>
    <target name="dot">
        <xslt in="dag.xml" out="dag.dot" style="xml-to-dot.xsl"/>
    </target>
</project>
The dot target generates Graphviz Dot language code for rendering the graph. Here is xml-to-dot.xsl:
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:xalan="http://xml.apache.org/xslt"
                exclude-result-prefixes="xalan">
    <xsl:output method="text"/>
    <xsl:template match="dag">
        digraph {
        rankdir="BT";
        node [style="filled", fillcolor="cyan", fontname="Helvetica"];
        <xsl:apply-templates select="//directed-edge-to"/>
        }
    </xsl:template>
    <xsl:template match="directed-edge-to">
        <xsl:value-of select="concat(ancestor::vertex/@name, '->', @vertex, ';')"/>
    </xsl:template>
</xsl:stylesheet>
Comments (2)
You can take advantage of XPath's implicit existential quantification on the = operator:
When you use any of the six comparison operators (=, !=, <, <=, >, and >=) to compare a node-set, the expression will return true if any node in the node-set satisfies the condition. When comparing one node-set with another, the expression returns true if any node in the first node-set satisfies the condition when compared with any node in the second node-set. XPath 2.0 introduces six new operators that don't perform this existential quantification (eq, ne, lt, le, gt, and ge). But in your case, you'll want to use "=" to get that existential quantification.
Note, of course, that you'll still want to use the not() function as you were doing. Most of the time, it's good to avoid the != operator. If you used it here instead of not(), then it would return true if there are any @vertex attributes that are not equal to the @name value, which is not your intention. (And if either node-set is empty, then it would return false, as comparisons with empty node-sets always return false.)
If you want to use eq instead, then you'd have to do something like you did: separate out the conditional from the iteration so you could bind current(). But in XPath 2.0, you can do this within an expression:
This is useful for when your condition isn't a simple equality comparison (and thus can't be existentially quantified using "="). For example: starts-with(@vertex, $v/@name).
XPath 2.0 also has an explicit way of performing existential quantification. Instead of the for expression above, we could have written this:
In addition to the "some" syntax, XPath 2.0 also supplies a corresponding "every" syntax for performing universal quantification.
Rather than using for-each, you could also use template rules, which are more modular (and powerful):
Again, in this case, we're relying on the existential quantification of =.
XSLT 1.0 prohibits use of the current() function in patterns, i.e., in the match attribute, but XSLT 2.0 allows it. In that case, current() refers to the node currently being matched. So in XSLT 2.0, we could also write this (without having to use a for expression):
Note that this pattern is essentially the same as the expression you tried to use in for-each, but whereas it doesn't do what you want in for-each, it does do what you want in the pattern (because what current() binds to is different).
Finally, I'll add one more variation that in some ways simplifies the logic (removing not()). This also goes back to using XSLT 1.0:
If you don't like the whitespace being output, add an empty rule for text nodes, so they'll get stripped out (overriding the default rule for text nodes, which is to copy them):
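A minimal sketch of that not()-free variation with the empty text rule added. This is a reconstruction under assumptions, not necessarily the stylesheet originally posted: it assumes the variation is a default vertex rule that is overridden by an empty, more specific rule for vertices with incoming edges, and it reuses the minimal-vertices and minimal-vertex element names from the question.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes"/>

  <xsl:template match="dag">
    <minimal-vertices>
      <xsl:apply-templates/>
    </minimal-vertices>
  </xsl:template>

  <!-- Default rule: report a vertex as minimal. -->
  <xsl:template match="vertex">
    <minimal-vertex name="{@name}"/>
  </xsl:template>

  <!-- More specific rule (the predicate gives it a higher default priority):
       a vertex that is the target of any edge produces no output. -->
  <xsl:template match="vertex[@name = //directed-edge-to/@vertex]"/>

  <!-- Empty rule for text nodes, overriding the built-in rule that copies them. -->
  <xsl:template match="text()"/>
</xsl:stylesheet>
```

The suppression relies on the same existential "=": the predicate is true as soon as one directed-edge-to anywhere in the document names the vertex.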
Or you could just be more selective in what nodes you apply templates to:
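A sketch of that more selective alternative, under the same assumptions as before (a default vertex rule plus an empty rule for vertices with incoming edges; element names taken from the question). Because only vertex children are selected, the whitespace text nodes between them are never processed, so no text() rule is needed:

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes"/>

  <xsl:template match="dag">
    <minimal-vertices>
      <!-- Select only vertex children; text nodes are skipped entirely. -->
      <xsl:apply-templates select="vertex"/>
    </minimal-vertices>
  </xsl:template>

  <!-- Default rule: report a vertex as minimal. -->
  <xsl:template match="vertex">
    <minimal-vertex name="{@name}"/>
  </xsl:template>

  <!-- Vertices that are the target of some edge produce no output. -->
  <xsl:template match="vertex[@name = //directed-edge-to/@vertex]"/>
</xsl:stylesheet>
```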
Which approach you take is partially dependent on taste, partially dependent on the wider context of your stylesheet and expected data (how much the input structure might vary, etc.).
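To make the comparison concrete, here is a hedged sketch of the three expression-level variants discussed above (general comparison with "=", a for expression with eq, and a some expression), placed side by side in one XSLT 2.0 stylesheet. The wrapper element names (general-comparison, for-expression, some-expression) are purely illustrative, and the exact expressions are a reconstruction rather than the original code. It needs an XSLT 2.0 processor such as Saxon; an XSLT 1.0 processor can only evaluate the first variant.

```xml
<?xml version="1.0"?>
<!-- version="2.0": the "for" and "some" expressions require XPath 2.0. -->
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes"/>

  <xsl:template match="/dag">
    <minimal-vertices>
      <!-- General comparison: "=" is true if ANY edge target equals @name,
           so not(...) keeps vertices that no edge points to. -->
      <general-comparison>
        <xsl:for-each select="vertex[not(@name = //directed-edge-to/@vertex)]">
          <minimal-vertex name="{@name}"/>
        </xsl:for-each>
      </general-comparison>

      <!-- "for" expression: $v names the outer vertex, so "eq" can be used
           inside the inner predicate. -->
      <for-expression>
        <xsl:for-each select="for $v in vertex
                              return $v[not(//directed-edge-to[@vertex eq $v/@name])]">
          <minimal-vertex name="{@name}"/>
        </xsl:for-each>
      </for-expression>

      <!-- "some" expression: explicit existential quantification; @name is
           still evaluated against the vertex being filtered. -->
      <some-expression>
        <xsl:for-each select="vertex[not(some $e in //directed-edge-to
                                         satisfies $e/@vertex eq @name)]">
          <minimal-vertex name="{@name}"/>
        </xsl:for-each>
      </some-expression>
    </minimal-vertices>
  </xsl:template>
</xsl:stylesheet>
```

For the example graph each wrapper should list the same vertices, since all three predicates express "no edge targets this vertex".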
I know I went way beyond what you were asking for, but I hope you at least found this interesting. :-)
One such XPath 1.0 expression is:
/*/vertex[not(@name = /*/vertex/directed-edge-to/@vertex)]
Then just put it into an XSLT stylesheet like this:
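A minimal sketch of such a stylesheet, assuming the output element names from the question (minimal-vertices, minimal-vertex); the stylesheet originally posted may have shaped its output differently:

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes"/>

  <xsl:template match="/">
    <minimal-vertices>
      <!-- Keep each vertex whose name equals none of the edge targets. -->
      <xsl:for-each select="/*/vertex[not(@name = /*/vertex/directed-edge-to/@vertex)]">
        <minimal-vertex name="{@name}"/>
      </xsl:for-each>
    </minimal-vertices>
  </xsl:template>
</xsl:stylesheet>
```

The single select expression does the work that the question split between for-each and if, and it is plain XPath 1.0, so it should also run under Xalan.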
When this stylesheet is applied to the XML document provided in the question, the wanted result is produced.
Do note: A solution for traversing full (maybe cyclic) graphs is available in XSLT here.