Unescaping entity-encoded XML results with XSLT

Posted 2024-12-21 15:03:04

I have run into a dilemma. In a particular application, I'm receiving XML results from a SOAP request that look like this:

<env:Envelope xmlns:env='http://schemas.xmlsoap.org/soap/envelope/'>
  <env:Header />
  <env:Body>
    <ns1:searchResponse xmlns:ns1='http://url.to.namespace' xmlns:xsi='http://www.w3.org/2001/XMLSchema-instance'>
      <ns1:result>&lt;?xml version="1.0"?&gt;&lt;results count="201" returned="201" code="200" msg="successful"&gt;&lt;result order="0"&gt;&lt;dirkey&gt;DK886shn3525&lt;/dirkey&gt;&lt;eid&gt;smith&lt;/eid&gt;&lt;email&gt;[email protected]&lt;/email&gt;&lt;fn&gt;Smith&lt;/fn&gt;&lt;ln&gt;Bob&lt;/ln&gt;&lt;wid&gt;859589157&lt;/wid&gt;&lt;score&gt;70&lt;/score&gt;&lt;/result&gt;&lt;result order="1"&gt;&lt;dirkey&gt;DK547fjx6702&lt;/dirkey&gt;&lt;eid&gt;james31&lt;/eid&gt;&lt;email&gt;[email protected]&lt;/email&gt;&lt;fn&gt;Tim&lt;/fn&gt;&lt;ln&gt;Allen&lt;/ln&gt;&lt;stu&gt;&lt;lvl&gt;Senior&lt;/lvl&gt;&lt;plans&gt;&lt;plan&gt;Technology Management-B&lt;/plan&gt;&lt;/plans&gt;&lt;contacts&gt;&lt;contact type="permanent"&gt;&lt;city&gt;Salina&lt;/city&gt;&lt;phone&gt;(123) 456-7890&lt;/phone&gt;&lt;postal&gt;67401&lt;/postal&gt;&lt;street1&gt;1111 Main Ln&lt;/street1&gt;&lt;state&gt;KS&lt;/state&gt;&lt;/contact&gt;&lt;/contacts&gt;&lt;/stu&gt;&lt;wid&gt;2222222222&lt;/wid&gt;&lt;score&gt;20&lt;/score&gt;&lt;/result&gt;</ns1:result>
    </ns1:searchResponse>
  </env:Body>
</env:Envelope>

I am most interested in the data contained within the <ns1:result> element. While this might make sense in an HTML world, I need the <ns1:result> text as XML. Intrigued by the possibility of doing this via XSL, I constructed the following stylesheet:

<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns:env="http://schemas.xmlsoap.org/soap/envelope/"
  xmlns:ns1="http://url.to.namespace"
  exclude-result-prefixes="env ns1">

  <xsl:output omit-xml-declaration="yes" indent="yes" method="text" />
  <xsl:strip-space elements="*"/>

  <!-- Template #1 - Identity Transform -->
  <xsl:template match="node()|@*">
    <xsl:copy>
      <xsl:apply-templates select="node()|@*"/>
    </xsl:copy>
  </xsl:template>

  <!-- Template #2 - for all text() nodes, disable output escaping.
       disable-output-escaping is defined only on xsl:value-of and xsl:text,
       not on xsl:copy-of. -->
  <xsl:template match="text()">
    <xsl:value-of select="." disable-output-escaping="yes" />
  </xsl:template>

</xsl:stylesheet>

...which technically does produce what I want:

<?xml version="1.0"?>
<results count="201" returned="201" code="200" msg="successful">
  <result order="0">
    <dirkey>DK886shn3525</dirkey>
    <eid>smith</eid>
    <email>[email protected]</email>
    <fn>Bob</fn>
    <ln>Smith</ln>
    <wid>859589157</wid>
    <score>70</score>
  </result>
  <result order="1">
    <dirkey>DK547fjx6702</dirkey>
    <eid>ta</eid>
    <email>[email protected]</email>
    <fn>Tim</fn>
    <ln>Allen</ln>
    <stu>
      <lvl>Senior</lvl>
      <plans>
        <plan>Technology Management-B</plan>
      </plans>
      <contacts>
        <contact type="permanent">
          <city>Salina</city>
          <phone>(123) 456-7890</phone>
          <postal>67401</postal>
          <street1>1111 Main Ln</street1>
          <state>KS</state>
        </contact>
      </contacts>
    </stu>
    <wid>2222222222</wid>
    <score>20</score>
  </result>
</results>

However, I've heard it said that DOE (disable-output-escaping) is the sign of a desperate individual. Indeed, when I try to run this XSLT through an application of ours (one that is designed to transform XML before passing it on to a templating engine), it doesn't work. I'm guessing that DOE is not implemented in our particular XSLT processor...

So, here's the ultimate question: is there a way in XSLT 1.0 to unescape these entities without using a parser-specific tactic like DOE? My one thought is constructing a method that translates certain escaped characters (e.g., &gt;) into their literal counterparts (>)...but I'm not entirely sure how I'd go about that.

As always, I appreciate your assistance.

P.S. Please, don't bother telling me how disgusting this output is or how they've mangled their document structure; we've already tried to get them to change it and that's not an option. :(

Comments (1)

满天都是小星星 2024-12-28 15:03:04

So, here's the ultimate question: is there a way in XSLT 1.0 to
unescape these entities without using a parser-specific tactic like
DOE? My one thought is constructing a method that translates certain
escaped characters (e.g., &gt;) into their literal counterparts
(>)...but I'm not entirely sure how I'd go about that.

There isn't a pure XSLT way to reconstruct destroyed markup before XSLT 3.0 (a W3C Recommendation since 2017), which has a standard function, parse-xml().

Until you have XSLT 3.0 available, the safe way to reconstruct destroyed markup is to call an extension function with a similar signature that you have to write yourself.

This extension function will try to parse its string argument into an instance of XmlDocument and, if successful, return the result.
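
To make the "parse the string argument, return the document if it succeeds" idea concrete, here is a minimal sketch of that logic. It is not tied to any XSLT processor: the answer's XmlDocument suggests a .NET host, and Python with the standard library is used here purely as an assumption for illustration. The function names parse_embedded_xml and extract_results are hypothetical, and SAMPLE is a shortened, properly closed stand-in for the envelope in the question.

```python
import xml.etree.ElementTree as ET

# Namespace URI copied from the envelope in the question.
NS = {"ns1": "http://url.to.namespace"}


def parse_embedded_xml(text):
    """Try to parse a string as an XML document.

    Returns the root element on success, or None if the string is missing
    or is not well-formed XML (hypothetical helper, for illustration only).
    """
    if text is None:
        return None
    try:
        # strip() matters: an XML declaration must be the very first thing.
        return ET.fromstring(text.strip())
    except ET.ParseError:
        return None


def extract_results(envelope_xml):
    """Pull the escaped payload out of <ns1:result> and parse it again."""
    envelope = ET.fromstring(envelope_xml)
    holder = envelope.find(".//ns1:result", NS)
    if holder is None:
        return None
    # The first parse has already turned &lt; and &gt; back into < and >,
    # so holder.text is the literal markup as a plain string.
    return parse_embedded_xml(holder.text)


# Shortened sample, not the real response from the question.
SAMPLE = """<env:Envelope xmlns:env='http://schemas.xmlsoap.org/soap/envelope/'>
  <env:Body>
    <ns1:searchResponse xmlns:ns1='http://url.to.namespace'>
      <ns1:result>&lt;?xml version="1.0"?&gt;&lt;results count="1"&gt;&lt;result order="0"&gt;&lt;eid&gt;smith&lt;/eid&gt;&lt;score&gt;70&lt;/score&gt;&lt;/result&gt;&lt;/results&gt;</ns1:result>
    </ns1:searchResponse>
  </env:Body>
</env:Envelope>"""

results = extract_results(SAMPLE)
for result in results.findall("result"):
    print(result.findtext("eid"), result.findtext("score"))  # prints: smith 70
```

Note that no character-by-character translation of &gt; and friends is needed: the first parse unescapes the text, and the second parse rebuilds the tree. Returning None on failure rather than raising lets the caller decide what to do with a payload that is not well-formed; the payload exactly as shown in the question ends without a closing results tag, so it would fall into that case.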
