How to prevent XSLT from introducing whitespace in HTML output
I am generating HTML from XML sources using XSLT. The HTML contains a lot of whitespace that was not in the original XML files. Normally this is not a problem, as the browser ignores the extra whitespace characters. But I am developing an application that relies on correct positioning of the text cursor inside the HTML page. The added whitespace throws off the offsets, making it impossible to reliably position the cursor inside an element.
My question: how can I get my XSLT to not introduce any additional whitespace in text nodes? I am using <xsl:strip-space elements="*"/>, but that does not keep the processor from introducing lots of whitespace. It looks like some pretty-printing is applied to the HTML, and I have no idea where this comes from. I am currently using Saxon PE 9.9.1.7.
[Edit]
I created a simple example that shows the same strange behaviour. First the XML:
<?xml version="1.0" encoding="UTF-8"?>
<root>
<p>This is a long sentence. Trying to reproduce a whitespace handling problem with XSLT. This manual describes the spacecraft, safety aspects, usage and maintenance procedures. Make sure the manual is available to anyone who will be using the product.</p>
</root>
Here is the simplified XSL:
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:xs="http://www.w3.org/2001/XMLSchema"
exclude-result-prefixes="xs"
version="1.0">
<xsl:output method="html" encoding="UTF-8"/>
<xsl:strip-space elements="*"/>
<xsl:template match="/">
<xsl:apply-templates/>
</xsl:template>
<xsl:template match="root">
<xsl:text disable-output-escaping="yes">&lt;!DOCTYPE html&gt;
</xsl:text>
<html>
<head>
<title>Test</title>
</head>
<body>
<xsl:apply-templates select="*"/>
<script src="cursor.js"></script>
</body>
</html>
</xsl:template>
<xsl:template match="p">
<p contenteditable="true" id="p1" onclick="show_position()">
<xsl:value-of select="."/>
</p>
</xsl:template>
</xsl:stylesheet>
The JavaScript file to show the current cursor position:
function show_position( )
{
alert('position: ' + document.getSelection().anchorOffset );
}
The HTML that is generated by the XSLT looks like this (shown in oXygen):
<!DOCTYPE html>
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
<title>Test</title>
</head>
<body>
<p contenteditable="true" id="p1" onclick="show_position()">This is a long sentence. Trying to reproduce a whitespace handling problem with XSLT.
This manual describes the spacecraft, safety aspects, usage and maintenance procedures.
Make sure the manual is available to anyone who will be using the product.</p><script src="cursor.js"></script></body>
</html>
Viewing the HTML in a browser collapses each run of extra whitespace into a single space, as expected. Clicking inside the paragraph shows the current offset from the start of the paragraph. Clicking immediately before 'This manual' shows position 86. Clicking one character to the right shows position 96. The same extra whitespace is introduced before the sentence starting with 'Make sure'.
I tried with Chrome and Safari - both show identical results. It does not seem to be a browser problem, but an issue with HTML generation by the XSLT processor. I have tried other Saxon versions but the resulting HTML is always the same.
Any further info on how to prevent these extra whitespace characters in my HTML output would be highly appreciated.
Comments (1)
The default for output method="html" is indent="yes", I think, so you could certainly explicitly set indent="no" on your xsl:output declaration.
Additionally, as you say you use Saxon PE 9.9, you have access to XSLT 3 features like suppress-indentation="p" and/or Saxon PE/EE specific settings that raise the normal line length to a very high value; check the documentation for e.g. saxon:line-length or similar.
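As a rough sketch of the two suggestions above, the xsl:output declaration could look like the following. This is not a tested drop-in replacement: version="3.0" is an assumption made here so that suppress-indentation is available, and only the p template from the question is reproduced; the remaining templates would stay as they are.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sketch only: version="3.0" is an assumption, needed for Option 2 below. -->
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="3.0">

  <!-- Option 1: switch indentation off entirely.
       indent="no" is also valid in an XSLT 1.0 stylesheet. -->
  <xsl:output method="html" encoding="UTF-8" indent="no"/>

  <!-- Option 2 (XSLT 3.0), to be used instead of Option 1:
       keep indentation in general, but ask the serializer not to
       insert whitespace inside p elements.
  <xsl:output method="html" encoding="UTF-8" indent="yes"
              suppress-indentation="p"/>
  -->

  <xsl:strip-space elements="*"/>

  <!-- Same p template as in the question, unchanged. -->
  <xsl:template match="p">
    <p contenteditable="true" id="p1" onclick="show_position()">
      <xsl:value-of select="."/>
    </p>
  </xsl:template>

</xsl:stylesheet>
```

With indentation off, the text of the p element should be serialized as one unbroken run, so the offsets reported by anchorOffset would be expected to match the character positions in the source XML. The saxon:line-length setting mentioned above is not shown because its exact usage should be checked against the Saxon documentation for the version in use.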