在 XOM 中解析 XHTML 文档时出现 DTD 下载错误

发布于 2024-07-24 04:59:52 字数 1455 浏览 5 评论 0原文

我正在尝试使用声明使用的 doctype 来解析 HTML 文档 过渡 dtd 如下:

http://www.w3.org/TR/xhtml1/DTD /xhtml1-transitional.dtd">

当我对文档执行 Builder.build 时,出现以下异常:

  java.io.IOException: Server returned HTTP response code: 503 for URL: http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd
       at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1305)
       at org.apache.xerces.impl.XMLEntityManager.setupCurrentEntity(Unknown Source)
       at org.apache.xerces.impl.XMLEntityManager.startEntity(Unknown Source)
       at org.apache.xerces.impl.XMLEntityManager.startDTDEntity(Unknown Source)
       at org.apache.xerces.impl.XMLDTDScannerImpl.setInputSource(Unknown Source)
       at org.apache.xerces.impl.XMLDocumentScannerImpl$DTDDispatcher.dispatch(Unknown Source)
       at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanDocument(Unknown Source)
       at org.apache.xerces.parsers.DTDConfiguration.parse(Unknown Source)
       at org.apache.xerces.parsers.DTDConfiguration.parse(Unknown Source)
       at org.apache.xerces.parsers.XMLParser.parse(Unknown Source)
       at org.apache.xerces.parsers.AbstractSAXParser.parse(Unknown Source)
       at nu.xom.Builder.build(Builder.java:1127)
       at nu.xom.Builder.build(Builder.java:1019)

如果删除 doc 类型声明,它解析得很好。 我可以 成功从我的浏览器下载 dtd,这告诉我 网址有效。 我不想删除文档类型声明。 是 有一种方法告诉构建者不要下载 dtd 或提供它 有替代的 dtd 吗?

I am trying to parse an HTML document with the doctype declared to use
the transitional dtd as follows:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">

When I do Builder.build on the document, I get the following exception:

  java.io.IOException: Server returned HTTP response code: 503 for URL: http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd
       at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1305)
       at org.apache.xerces.impl.XMLEntityManager.setupCurrentEntity(Unknown Source)
       at org.apache.xerces.impl.XMLEntityManager.startEntity(Unknown Source)
       at org.apache.xerces.impl.XMLEntityManager.startDTDEntity(Unknown Source)
       at org.apache.xerces.impl.XMLDTDScannerImpl.setInputSource(Unknown Source)
       at org.apache.xerces.impl.XMLDocumentScannerImpl$DTDDispatcher.dispatch(Unknown Source)
       at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanDocument(Unknown Source)
       at org.apache.xerces.parsers.DTDConfiguration.parse(Unknown Source)
       at org.apache.xerces.parsers.DTDConfiguration.parse(Unknown Source)
       at org.apache.xerces.parsers.XMLParser.parse(Unknown Source)
       at org.apache.xerces.parsers.AbstractSAXParser.parse(Unknown Source)
       at nu.xom.Builder.build(Builder.java:1127)
       at nu.xom.Builder.build(Builder.java:1019)

If I remove the doc type declaration, it parses just fine. I can
successfully download the dtd from my browser, which tells me that the
url is valid. I don't want to remove the doc type declaration. Is
there a way tell the builder not to download the dtd or provide it
with an alternate dtd?

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(2

妞丶爷亲个 2024-07-31 04:59:52

这解决了这个问题:

DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setValidating(false);
            factory.setFeature("http://apache.org/xml/features/nonvalidating/load-external-dtd", false);
            Document document = factory.newDocumentBuilder().parse(is);

This solves the problem:

DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setValidating(false);
            factory.setFeature("http://apache.org/xml/features/nonvalidating/load-external-dtd", false);
            Document document = factory.newDocumentBuilder().parse(is);
━╋う一瞬間旳綻放 2024-07-31 04:59:52

快速浏览一下 Builder,我想你可以提供一个 EntityResolver 通过采用 XMLReader。 我会尽可能避免让解析器从互联网下载文件。

Taking a quick look at the javadoc for Builder, I guess you could provide an EntityResolver via the constructor that takes a XMLReader. I would avoid letting the parser download files from the internet where possible.

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文