Remove duplicate XML headers
For some reason, HTML Tidy produces this output:
<?xml version="1.0" encoding="utf-16"?>
<?xml version="1.0" encoding="utf-16"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta name="generator" content=
"HTML Tidy for Linux/x86 (vers 11 February 2007), see www.w3.org" />
<meta name="vs_targetSchema" content="http://schemas.microsoft.com/intellisense/ie5" />
...rest of document
So there are two XML declarations, and both declare the wrong encoding (UTF-16 instead of UTF-8).
Is there a way to remove the second declaration, change the encoding to UTF-8, and also remove the DOCTYPE with XSL?
Comments (2)
I think it would be better to fix the original problem. Are you using the HTML Tidy library?
Try setting output-encoding to utf8 and add-xml-decl to false. The DOCTYPE can be suppressed by setting the doctype option to omit.
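As a rough illustration, the three settings named above could go in a Tidy configuration file. This is only a sketch: it assumes the command-line tidy is being used, and the file name tidy.conf is hypothetical. If Tidy is called through a library binding instead, the same option names would be set through that binding's own mechanism.

```
// tidy.conf (hypothetical file name)
// write the output as UTF-8 instead of UTF-16
output-encoding: utf8
// do not let Tidy add an XML declaration of its own
add-xml-decl: no
// leave the DOCTYPE out of the output
doctype: omit
```

It would then be passed with something like `tidy -config tidy.conf input.html`, where input.html stands in for the real source file.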
Yes. Create a template that matches the first child element you want to keep, and have it output just the content of that element.
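One way to read this suggestion is the stylesheet below. It is a sketch rather than a tested solution. It assumes an XSLT 1.0 processor and that the element to keep is the XHTML html root; the xhtml prefix is a name chosen here for illustration. Because xsl:output sets no doctype-public or doctype-system, no DOCTYPE is written, and the encoding attribute makes the serializer emit a single UTF-8 declaration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xhtml="http://www.w3.org/1999/xhtml">

  <!-- Serialize as UTF-8; with no doctype-* attributes, no DOCTYPE is emitted -->
  <xsl:output method="xml" encoding="UTF-8"/>

  <!-- Match the root html element (in the XHTML namespace) and copy it,
       together with all of its attributes and descendants -->
  <xsl:template match="/xhtml:html">
    <xsl:copy-of select="."/>
  </xsl:template>

</xsl:stylesheet>
```

One caveat: an XML document may contain only one XML declaration, at the very start, so the output shown in the question is not well-formed. An XML parser will normally reject it before any stylesheet runs, which means the duplicate declaration has to be fixed at the source (or stripped as text) before XSLT can help with the encoding and the DOCTYPE.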