Preserving character entities during XSLT

Posted on 2024-12-11 11:43:29

I've been using, for example, the degree character entity &deg; in my source XML, and it was always output as &deg; after the transformation and worked fine. However, I recently had to switch from the Xalan processor to Saxon, and now the character is output as an actual degree character (°) in the HTML, which the browser renders as ¬∞.

I'm not really sure why it worked in Xalan, but after searching around I thought character maps would be the solution, based on what I found on this page:

http://www.xmlplease.com/xmltraining/xslt-by-example/examples/character-map_1.html

But when I do the same thing, the character map appears to be ignored and I still see ¬∞.

Again, I'm using Saxon 9 with the xslt task in Ant on Java 6. I'd like the &deg; in my XML to be preserved (or changed to &#176;) when transforming to HTML. Any suggestions?

Comments (2)

音盲 2024-12-18 11:43:29

It looks like the new output is not marked as UTF-8?

Most often, when one character becomes two, it's because you are sending UTF-8 to the browser while declaring a different encoding (e.g. ISO-8859-1, windows-1252, etc.). Declaring UTF-8 in the HTML head may not be enough. You probably also need to send it as a header in the HTTP response.
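
As a rough illustration of declaring the encoding on the XSLT side, here is a minimal sketch. The wrapping template and the page title are hypothetical and not taken from the question; the only relevant part is the `xsl:output` declaration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- State the output encoding explicitly. With method="html", the
       serializer is expected to add a matching Content-Type <meta>
       element inside <head>, provided the result has a <head>. -->
  <xsl:output method="html" encoding="UTF-8"/>

  <!-- Hypothetical template: wraps the input in a minimal HTML page. -->
  <xsl:template match="/">
    <html>
      <head><title>Encoding check</title></head>
      <body><xsl:copy-of select="node()"/></body>
    </html>
  </xsl:template>

</xsl:stylesheet>
```

If the page is served over HTTP, a charset in the Content-Type response header generally takes precedence over the meta element, so the server configuration still needs to agree with what the stylesheet declares.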

Using an entity or character reference in the source (&deg; or &#176;) should not help if the XSLT parser resolves all the entities.

Otherwise, there may be a flag you can set to avoid the translation of entities?

温柔戏命师 2024-12-18 11:43:29

You can't force the input entities to be preserved, but you can ensure that any non-ASCII characters are output as entity or character references by specifying encoding="us-ascii" on xsl:output.
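
A minimal sketch of that workaround, assuming the stylesheet otherwise just copies its input (the identity template is only a stand-in for the real templates):

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- With an ASCII output encoding the serializer cannot write U+00B0
       directly, so it has to emit an entity or character reference. -->
  <xsl:output method="html" encoding="us-ascii"/>

  <!-- Identity template (placeholder): copy the input through unchanged. -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```

Whether the degree sign comes out as &deg; or as a numeric reference such as &#176; depends on the processor and output method. Either form is plain ASCII, so it survives a wrongly declared encoding.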

The fact that your browser doesn't display the degree sign correctly means that the document is being served with the wrong encoding. Using us-ascii is a workaround for this, but it doesn't solve the underlying problem which is that there's something wrong in your configuration somewhere (it can be hard to find out where).

I don't know why your character maps are ignored. Assuming you've coded it correctly, the most likely reason is that the serialisation isn't being done by the XSLT processor but by something else: for example, you might be transforming to a DOM and then serialising the DOM.
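
For comparison, a correctly wired character map might look like the sketch below. The map name `keep-entities` is made up for illustration, and the identity template again stands in for the real templates. Character maps are an XSLT 2.0 feature and are applied only at serialization time, which is why they have no effect if something other than the XSLT processor writes the output.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- The map only takes effect if xsl:output refers to it by name. -->
  <xsl:output method="html" use-character-maps="keep-entities"/>

  <!-- Hypothetical map name: write U+00B0 to the output as the literal
       text &deg; instead of the raw character. -->
  <xsl:character-map name="keep-entities">
    <xsl:output-character character="&#176;" string="&amp;deg;"/>
  </xsl:character-map>

  <!-- Identity template (placeholder): copy the input through unchanged. -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```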

You can get more control over how Saxon serialises special characters with the HTML output method using saxon:character-representation - see http://saxonica.com/documentation/extensions/output-extras/character-representation.xml
