How to convert org.w3c.dom.Document to org.jdom.Document
I need to convert an org.w3c.dom.Document to an org.jdom.Document.
I have tried the following:
InputStream inputStream = new ByteArrayInputStream(str.getBytes());
Tidy tidy = new Tidy();
tidy.setMakeClean(false);
tidy.setShowWarnings(true); //tidy.setShowWarnings(false);
tidy.setTidyMark(false);
tidy.setNumEntities(true);
tidy.setQuoteAmpersand(true);
tidy.setQuoteMarks(true);
tidy.setQuoteNbsp(false);
tidy.setHideEndTags(false);
tidy.setDropEmptyParas(false);
org.w3c.dom.Document tidyDOM = tidy.parseDOM(inputStream, null);
DOMBuilder domBuilder = new DOMBuilder();
org.jdom.Document jdomDoc = domBuilder.build(tidyDOM);
The call domBuilder.build(tidyDOM) throws the following exception:
org.jdom.IllegalNameException: The name "html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"" is not legal for JDOM/XML DocTypes: XML names cannot contain the character " ".
at org.jdom.DocType.setElementName(DocType.java:171)
at org.jdom.DocType.<init>(DocType.java:111)
at org.jdom.DocType.<init>(DocType.java:144)
at org.jdom.DefaultJDOMFactory.docType(DefaultJDOMFactory.java:118)
at org.jdom.input.DOMBuilder.buildTree(DOMBuilder.java:332)
at org.jdom.input.DOMBuilder.buildTree(DOMBuilder.java:170)
at org.jdom.input.DOMBuilder.build(DOMBuilder.java:135)
at test.JaxenTest.testParsingVisitor(JaxenTest.java:58)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)

Comments (2)
It looks as if JTidy is creating a malformed DocType node. I suggest using a different HTML parser.
I recommend the Validator.nu HTML Parser, but there are plenty of others.
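To check whether the DocType node really is malformed, it can help to print what the W3C DOM reports before handing it to JDOM. The sketch below uses only the JDK; `DoctypeInspector`, `parse`, and `describe` are hypothetical names, and the sample document is parsed with the JDK's own parser rather than JTidy, so it only shows what a well-formed doctype looks like. With the JTidy DOM from the question you would call `describe(tidyDOM)` instead; the exception message suggests the name would come back as the whole `html PUBLIC "..."` string, which JDOM rejects because of the spaces.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.DocumentType;
import org.xml.sax.InputSource;

public class DoctypeInspector {

    /** Parses XML text into a W3C DOM without fetching the external DTD. */
    public static Document parse(String xml) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        // Resolve every external entity to an empty stream so the parser stays offline.
        builder.setEntityResolver((publicId, systemId) -> new InputSource(new StringReader("")));
        return builder.parse(new InputSource(new StringReader(xml)));
    }

    /** Reports the doctype fields exactly as the DOM exposes them. */
    public static String describe(Document doc) {
        DocumentType dt = doc.getDoctype();
        if (dt == null) {
            return "no doctype";
        }
        return "name=[" + dt.getName() + "]"
                + " publicId=[" + dt.getPublicId() + "]"
                + " systemId=[" + dt.getSystemId() + "]";
    }

    public static void main(String[] args) throws Exception {
        String xhtml = "<!DOCTYPE html PUBLIC \"-//W3C//DTD XHTML 1.0 Transitional//EN\" "
                + "\"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd\">"
                + "<html><head><title>t</title></head><body/></html>";
        System.out.println(describe(parse(xhtml)));
    }
}
```

A legal doctype name is a single XML name such as `html`, with the public and system identifiers reported separately. If the name field contains anything more than that, the problem is in the DOM that was built, not in JDOM.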

Add these two settings and everything should work.
The first setting tells JTidy to output an XHTML file. An XHTML file is valid XML.
The second setting tells JTidy not to output a DOCTYPE line into the code. For some reason JDOM does not seem to recognize legitimate HTML/XHTML doctypes.
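The two settings themselves are missing from the answer as shown here. Going by the description, they are probably JTidy's `setXHTML` and `setDocType` options; treat that as an assumption, not as the answerer's exact code. A sketch, added to the `tidy` configuration from the question before `parseDOM` is called:

```java
// Assumed settings, inferred from the description in the answer
tidy.setXHTML(true);      // ask JTidy for XHTML output, which is well-formed XML
tidy.setDocType("omit");  // leave the DOCTYPE declaration out, so JDOM never sees it
```

With the DOCTYPE omitted, the DOM passed to DOMBuilder should contain no DocumentType node, so the DocType constructor that throws in the stack trace is never reached.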