如何使用带有样式表和 xsltproc 的 xslt 从 xml 中删除元素？

发布于 2024-07-08 20:21:19 字数 752 浏览 10 评论 0原文

我有很多 XML 文件，它们具有以下形式：

<Element fruit="apple" animal="cat" />

我想将其从文件中删除。

使用 XSLT 样式表和 Linux 命令行实用程序 xsltproc，我该如何做到这一点？

此时在脚本中我已经有了包含我想要删除的元素的文件列表，因此单个文件可以用作参数。

编辑：这个问题最初缺乏意图。

我想要实现的是删除整个元素“Element”（fruit==“apple”&&animal==“cat”）。在同一个文档中有许多名为“Element”的元素，我希望保留这些元素。所以

<Element fruit="orange" animal="dog" />
<Element fruit="apple"  animal="cat" />
<Element fruit="pear"   animal="wild three eyed mongoose of kentucky" />

会变成：

<Element fruit="orange" animal="dog" />
<Element fruit="pear"   animal="wild three eyed mongoose of kentucky" />

原文

I have a lot of XML files which have something of the form:

<Element fruit="apple" animal="cat" />

Which I want to be removed from the file.

Using an XSLT stylesheet and the Linux command-line utility xsltproc, how could I do this?

By this point in the script I already have the list of files containing the element I wish to remove, so the single file can be used as a parameter.

EDIT: the question was originally lacking in intention.

What I am trying to achieve is to remove the entire element "Element" where (fruit=="apple" && animal=="cat"). In the same document there are many elements named "Element", I wish for these to remain. So

<Element fruit="orange" animal="dog" />
<Element fruit="apple"  animal="cat" />
<Element fruit="pear"   animal="wild three eyed mongoose of kentucky" />

Would become:

<Element fruit="orange" animal="dog" />
<Element fruit="pear"   animal="wild three eyed mongoose of kentucky" />

分享到QQ

分享到微博

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

需要登录才能够评论，你可以免费注册一个本站的账号。

活雷疯 2024-07-15 20:21:19

使用最基本的 XSLT 设计模式之一：“覆盖 标识转换”，只需编写以下内容：

<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

 <xsl:output omit-xml-declaration="yes"/>

    <xsl:template match="node()|@*">
      <xsl:copy>
         <xsl:apply-templates select="node()|@*"/>
      </xsl:copy>
    </xsl:template>

    <xsl:template match="Element[@fruit='apple' and @animal='cat']"/>
</xsl:stylesheet>

请注意第二个模板如何仅针对具有属性“fruit”的名为“Element”的元素覆盖身份（第一个）模板”，值为“apple”，属性“animal”，值为“cat”。该模板的主体为空，这意味着匹配的元素将被简单地忽略（匹配时不会产生任何内容）。

当此转换应用于以下源 XML 文档时：

<doc>... 
    <Element name="same">foo</Element>...
    <Element fruit="apple" animal="cat" />
    <Element fruit="pear" animal="cat" />
    <Element name="same">baz</Element>...
    <Element name="same">foobar</Element>...
</doc>

产生想要的结果：

<doc>... 
    <Element name="same">foo</Element>...
    <Element fruit="pear" animal="cat"/>
    <Element name="same">baz</Element>...
    <Element name="same">foobar</Element>...
</doc>

可以找到使用和覆盖身份模板的更多代码片段此处。

Using one of the most fundamental XSLT design patterns: "Overriding the identity transformation" one will just write the following:

<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

 <xsl:output omit-xml-declaration="yes"/>

    <xsl:template match="node()|@*">
      <xsl:copy>
         <xsl:apply-templates select="node()|@*"/>
      </xsl:copy>
    </xsl:template>

    <xsl:template match="Element[@fruit='apple' and @animal='cat']"/>
</xsl:stylesheet>

Do note how the second template overrides the identity (1st) template only for elements named "Element" that have an attribute "fruit" with value "apple" and attribute "animal" with value "cat". This template has empty body, which means that the matched element is simply ignored (nothing is produced when it is matched).

When this transformation is applied on the following source XML document:

<doc>... 
    <Element name="same">foo</Element>...
    <Element fruit="apple" animal="cat" />
    <Element fruit="pear" animal="cat" />
    <Element name="same">baz</Element>...
    <Element name="same">foobar</Element>...
</doc>

the wanted result is produced:

<doc>... 
    <Element name="same">foo</Element>...
    <Element fruit="pear" animal="cat"/>
    <Element name="same">baz</Element>...
    <Element name="same">foobar</Element>...
</doc>

More code snippets of using and overriding the identity template can be found here.

回复收藏 0 原文

撕心裂肺的伤痛 2024-07-15 20:21:19

@Dimitre Novatchev 的答案当然既正确又优雅，但有一个概括（OP 没有询问）：如果您要过滤的元素也有您想要保留的子元素或文本怎么办？

我相信这个微小的变化涵盖了这种情况：

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    version="2.0">

    <xsl:template match="node()|@*">
        <xsl:copy>
            <xsl:apply-templates select="node()|@*"/>
        </xsl:copy>
    </xsl:template>

    <!-- drop DropMe elements, keeping child text and elements -->
    <xsl:template match="DropMe">
        <xsl:apply-templates/>
    </xsl:template>

</xsl:stylesheet>

匹配条件可能会很复杂以指定其他属性等，并且如果您要删除其他内容，您可以使用多个此类模板。

所以这个输入：

<?xml version="1.0" encoding="UTF-8"?>
<mydocument>
    <p>Here's text to keep</p>
    <p><DropMe>Keep this text but not the element</DropMe>; and keep what follows.</p>
    <p><DropMe>Also keep this text and <b>this child element</b> too</DropMe>, along with what follows.</p>
</mydocument>

产生这个输出：

<?xml version="1.0" encoding="UTF-8"?><mydocument>
    <p>Here's text to keep</p>
    <p>Keep this text but not the element; and keep what follows.</p>
    <p>Also keep this text and <b>this child element</b> too, along with what follows.</p>
</mydocument>

归功于 XSLT Cookbook。

The answer by @Dimitre Novatchev is certainly both correct and elegant, but there's a generalization (that the OP didn't ask about): what if the element you want to filter also has child elements or text that you want to keep?

I believe this minor variation covers that case:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    version="2.0">

    <xsl:template match="node()|@*">
        <xsl:copy>
            <xsl:apply-templates select="node()|@*"/>
        </xsl:copy>
    </xsl:template>

    <!-- drop DropMe elements, keeping child text and elements -->
    <xsl:template match="DropMe">
        <xsl:apply-templates/>
    </xsl:template>

</xsl:stylesheet>

The match condition can be complicated to specify other attributes, etc., and you can use multiple such templates if you're dropping other things.

So this input:

<?xml version="1.0" encoding="UTF-8"?>
<mydocument>
    <p>Here's text to keep</p>
    <p><DropMe>Keep this text but not the element</DropMe>; and keep what follows.</p>
    <p><DropMe>Also keep this text and <b>this child element</b> too</DropMe>, along with what follows.</p>
</mydocument>

produces this output:

<?xml version="1.0" encoding="UTF-8"?><mydocument>
    <p>Here's text to keep</p>
    <p>Keep this text but not the element; and keep what follows.</p>
    <p>Also keep this text and <b>this child element</b> too, along with what follows.</p>
</mydocument>

Credit to XSLT Cookbook.

回复收藏 0 原文

~没有更多了~