XSLT: unescape twice (e.g. &amp;amp; becomes &)

Posted on 2024-09-14 10:32:09

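I am trying to transform some XML returned by the Twitter Search API. It looks like the content element contains text that is escaped twice, Inception style. When I use the following in my XSL stylesheet, it only unescapes it once: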

<xsl:value-of select="atom:content" disable-output-escaping="yes" />

How do I perform the second round of unescaping? Thanks!

Example input document:

<?xml version="1.0" encoding="UTF-8"?>
<feed xmlns:google="http://base.google.com/ns/1.0" xml:lang="en-US" xmlns:openSearch="http://a9.com/-/spec/opensearch/1.1/" xmlns="http://www.w3.org/2005/Atom" xmlns:twitter="http://api.twitter.com/">
  <id>tag:search.twitter.com,2005:search/from:myusername</id>
  <link type="text/html" href="http://search.twitter.com/search?q=from%3Amyusername" rel="alternate"/>
  <link type="application/atom+xml" href="http://search.twitter.com/search.atom?q=from%3Amyusername" rel="self"/>
  <title>from:myusername - Twitter Search</title>
  <link type="application/opensearchdescription+xml" href="http://search.twitter.com/opensearch.xml" rel="search"/>
  <link type="application/atom+xml" href="http://search.twitter.com/search.atom?q=from%3Amyusername&amp;since_id=21346924004" rel="refresh"/>
  <updated>2010-08-16T21:38:42Z</updated>
  <openSearch:itemsPerPage>15</openSearch:itemsPerPage>
  <entry>
    <id>tag:search.twitter.com,2005:21346924004</id>
    <published>2010-08-16T21:38:42Z</published>
    <link type="text/html" href="http://twitter.com/myusername/statuses/21346924004" rel="alternate"/>
    <title>testing special chars for a custom twitter client &lt; &gt; &amp; ' &#163; &#8364;</title>
    <content type="html">testing special chars for a custom twitter client &amp;lt; &amp;gt; &amp;amp; &amp;apos; &#163; &#8364;</content>
    <updated>2010-08-16T21:38:42Z</updated>
    <link type="image/png" href="http://a1.twimg.com/profile_images/820967365/twitter_avatar_normal.jpg" rel="image"/>
    <twitter:geo>
    </twitter:geo>
    <twitter:metadata>
      <twitter:result_type>recent</twitter:result_type>
    </twitter:metadata>
    <twitter:source>&lt;a href=&quot;http://twitter.com/&quot;&gt;web&lt;/a&gt;</twitter:source>
    <twitter:lang>en</twitter:lang>
    <author>
      <name>myusername</name>
      <uri>http://twitter.com/myusername</uri>
    </author>
  </entry>
</feed>



Comments (1)

执笔绘流年 2024-09-21 10:32:09

This stylesheet:

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:char="character" xmlns:atom="http://www.w3.org/2005/Atom">
    <!-- Lookup table mapping entity names to literal characters; read back via document('') -->
    <char:char ent="lt">&lt;</char:char>
    <char:char ent="gt">&gt;</char:char>
    <char:char ent="amp">&amp;</char:char>
    <char:char ent="apos">'</char:char>
    <char:char ent="quot">"</char:char>
    <!-- Identity template: copy everything else unchanged -->
    <xsl:template match="@*|node()">
        <xsl:copy>
            <xsl:apply-templates select="@*|node()"/>
        </xsl:copy>
    </xsl:template>
    <!-- Recursively replace each entity reference left in the text of atom:content -->
    <xsl:template match="atom:content/text()" name="replace">
        <xsl:param name="pText" select="."/>
        <xsl:choose>
            <xsl:when test="contains($pText,'&amp;')">
                <xsl:variable name="vAfter" select="substring-after($pText,'&amp;')"/>
                <xsl:value-of select="concat(substring-before($pText,'&amp;'),
                                             document('')/*/char:*
                                             [@ent =
                                             substring-before($vAfter,';')])"
                                                       disable-output-escaping="yes"/>
                <xsl:call-template name="replace">
                    <xsl:with-param name="pText" select="substring-after($vAfter,';')"/>
                </xsl:call-template>
            </xsl:when>
            <xsl:otherwise>
                <xsl:value-of select="$pText"/>
            </xsl:otherwise>
        </xsl:choose>
    </xsl:template>
</xsl:stylesheet>
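
The XML parser has already decoded the first layer of escaping, so the `replace` template receives a text node in which the remaining entity references are plain characters. The trace below, written as XML comments, is a rough sketch of how the template consumes such a value. The sample string is a shortened, hypothetical one chosen for illustration and is not taken from the feed.

```xml
<!-- Serialized input:  <content>a &amp;lt; b &amp;amp; c</content>      -->
<!-- Parsed text node:  a &lt; b &amp; c                                  -->
<!-- call 1: text before '&' is "a ",  entity name is "lt",  emits "a <"  -->
<!--         recurses on " b &amp; c"                                     -->
<!-- call 2: text before '&' is " b ", entity name is "amp", emits " b &" -->
<!--         recurses on " c"                                             -->
<!-- call 3: no '&' left, emits " c" and stops                            -->
<!-- Result text:       a < b & c                                         -->
```

Each call handles exactly one entity reference: it emits the text before the `&`, looks up the name between `&` and `;` in the `char:char` table, and passes the remainder to the next call.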

Output:

<feed xml:lang="en-US" xmlns:google="http://base.google.com/ns/1.0" xmlns:openSearch="http://a9.com/-/spec/opensearch/1.1/" xmlns="http://www.w3.org/2005/Atom" xmlns:twitter="http://api.twitter.com/">
    <id>tag:search.twitter.com,2005:search/from:myusername</id>
    <link type="text/html" href="http://search.twitter.com/search?q=from%3Amyusername" rel="alternate"></link>
    <link type="application/atom+xml" href="http://search.twitter.com/search.atom?q=from%3Amyusername" rel="self"></link>
    <title>from:myusername - Twitter Search</title>
    <link type="application/opensearchdescription+xml" href="http://search.twitter.com/opensearch.xml" rel="search"></link>
    <link type="application/atom+xml" href="http://search.twitter.com/search.atom?q=from%3Amyusername&amp;since_id=21346924004" rel="refresh"></link>
    <updated>2010-08-16T21:38:42Z</updated>
    <openSearch:itemsPerPage>15</openSearch:itemsPerPage>
    <entry>
        <id>tag:search.twitter.com,2005:21346924004</id>
        <published>2010-08-16T21:38:42Z</published>
        <link type="text/html" href="http://twitter.com/myusername/statuses/21346924004" rel="alternate"></link>
        <title>testing special chars for a custom twitter client &lt; &gt; &amp; ' £ €</title>
        <content type="html">testing special chars for a custom twitter client < > & ' £ €</content>
        <updated>2010-08-16T21:38:42Z</updated>
        <link type="image/png" href="http://a1.twimg.com/profile_images/820967365/twitter_avatar_normal.jpg" rel="image"></link>
        <twitter:geo></twitter:geo>
        <twitter:metadata>
            <twitter:result_type>recent</twitter:result_type>
        </twitter:metadata>
        <twitter:source>&lt;a href="http://twitter.com/"&gt;web&lt;/a&gt;</twitter:source>
        <twitter:lang>en</twitter:lang>
        <author>
            <name>myusername</name>
            <uri>http://twitter.com/myusername</uri>
        </author>
    </entry>
</feed>

Note: The text node of atom:content is now unescaped, but the output is not well-formed.

Edit: In case you need well-formed output, you could add this output declaration:

<xsl:output cdata-section-elements="atom:content"/>

Then you could remove disable-output-escaping="yes", so your output will be:

<feed xml:lang="en-US" xmlns:google="http://base.google.com/ns/1.0" xmlns:openSearch="http://a9.com/-/spec/opensearch/1.1/" xmlns="http://www.w3.org/2005/Atom" xmlns:twitter="http://api.twitter.com/">
    <id>tag:search.twitter.com,2005:search/from:myusername</id>
    <link type="text/html" href="http://search.twitter.com/search?q=from%3Amyusername" rel="alternate"></link>
    <link type="application/atom+xml" href="http://search.twitter.com/search.atom?q=from%3Amyusername" rel="self"></link>
    <title>from:myusername - Twitter Search</title>
    <link type="application/opensearchdescription+xml" href="http://search.twitter.com/opensearch.xml" rel="search"></link>
    <link type="application/atom+xml" href="http://search.twitter.com/search.atom?q=from%3Amyusername&amp;since_id=21346924004" rel="refresh"></link>
    <updated>2010-08-16T21:38:42Z</updated>
    <openSearch:itemsPerPage>15</openSearch:itemsPerPage>
    <entry>
        <id>tag:search.twitter.com,2005:21346924004</id>
        <published>2010-08-16T21:38:42Z</published>
        <link type="text/html" href="http://twitter.com/myusername/statuses/21346924004" rel="alternate"></link>
        <title>testing special chars for a custom twitter client &lt; &gt; &amp; ' £ €</title>
        <content type="html"><![CDATA[testing special chars for a custom twitter client < > & ' £ €]]></content>
        <updated>2010-08-16T21:38:42Z</updated>
        <link type="image/png" href="http://a1.twimg.com/profile_images/820967365/twitter_avatar_normal.jpg" rel="image"></link>
        <twitter:geo></twitter:geo>
        <twitter:metadata>
            <twitter:result_type>recent</twitter:result_type>
        </twitter:metadata>
        <twitter:source>&lt;a href="http://twitter.com/"&gt;web&lt;/a&gt;</twitter:source>
        <twitter:lang>en</twitter:lang>
        <author>
            <name>myusername</name>
            <uri>http://twitter.com/myusername</uri>
        </author>
    </entry>
</feed>

Note: No escaping is performed inside CDATA sections.
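
Assembled, the well-formed variant might look like the sketch below. It is the stylesheet from this answer with the `xsl:output` declaration added and `disable-output-escaping` removed. The entity references are written out (`&lt;`, `&amp;`) as they must be inside a stylesheet file. The comments are additions for readability.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:char="character"
    xmlns:atom="http://www.w3.org/2005/Atom">
    <!-- Serialize the text of atom:content as a CDATA section -->
    <xsl:output cdata-section-elements="atom:content"/>
    <!-- Lookup table: entity name to literal character -->
    <char:char ent="lt">&lt;</char:char>
    <char:char ent="gt">&gt;</char:char>
    <char:char ent="amp">&amp;</char:char>
    <char:char ent="apos">'</char:char>
    <char:char ent="quot">"</char:char>
    <!-- Identity template: copy everything else unchanged -->
    <xsl:template match="@*|node()">
        <xsl:copy>
            <xsl:apply-templates select="@*|node()"/>
        </xsl:copy>
    </xsl:template>
    <!-- Second round of unescaping for the text of atom:content -->
    <xsl:template match="atom:content/text()" name="replace">
        <xsl:param name="pText" select="."/>
        <xsl:choose>
            <xsl:when test="contains($pText,'&amp;')">
                <xsl:variable name="vAfter" select="substring-after($pText,'&amp;')"/>
                <!-- Text before the '&' plus the character for the entity name -->
                <xsl:value-of select="concat(substring-before($pText,'&amp;'),
                                             document('')/*/char:*
                                             [@ent = substring-before($vAfter,';')])"/>
                <!-- Continue with whatever follows the ';' -->
                <xsl:call-template name="replace">
                    <xsl:with-param name="pText" select="substring-after($vAfter,';')"/>
                </xsl:call-template>
            </xsl:when>
            <xsl:otherwise>
                <xsl:value-of select="$pText"/>
            </xsl:otherwise>
        </xsl:choose>
    </xsl:template>
</xsl:stylesheet>
```

Two limitations of this sketch are worth keeping in mind: an entity name that is not in the `char:char` table is replaced with nothing, and an `&` that is not followed by a `;` causes the rest of the text to be dropped.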

