
Android Java XML: junk after document element

Posted on 2024-10-12 04:09:37

I'm using SAX to read/parse XML documents, and it works fine except for one particular site, where Eclipse tells me "junk after document element" and I get no data back:

http://www.zachblume.com/apis/rhyme.php?format=xml&word=example

The site isn't mine; I just want to pull some data from it.


深海少女心 2024-10-19 04:09:37

Yes, that's not an XML document. It's trying to include more than one root element:

<?xml version="1.0"?> 
<word>ampal</word> 
<word>ample</word> 
<word>hampel</word> 
<word>hample</word> 
<word>lampl</word> 
<word>pampel</word>
<word>sample</word>

The parser regards everything after <word>ampal</word> as junk, because by that point it has already read a complete document. Hence the complaint about "junk after document element".

An XML document can only have one root, but several children within the root. For example:

<?xml version="1.0"?> 
<words>
  <word>ampal</word> 
  <word>ample</word> 
  <word>hampel</word> 
  <word>hample</word> 
  <word>lampl</word> 
  <word>pampel</word> 
  <word>sample</word>
</words>
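
To see the difference for yourself, here is a minimal sketch in Java that feeds both forms to a SAX parser and reports whether each one parses. The class name WellFormedCheck and the method isWellFormed are made up for illustration, and the exact error message wording depends on which parser implementation sits behind SAXParserFactory.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper, not part of the original question or answers.
class WellFormedCheck {

    // Returns true if the text parses as a complete XML document, false if
    // the parser rejects it (for example, content after the root element).
    static boolean isWellFormed(String xml) {
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), new DefaultHandler());
            return true;
        } catch (SAXException e) {
            // DefaultHandler rethrows fatal errors, so malformed input lands here.
            return false;
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException("Parser setup or I/O failed", e);
        }
    }

    public static void main(String[] args) {
        String multiRoot = "<?xml version=\"1.0\"?>\n<word>ampal</word>\n<word>ample</word>";
        String singleRoot = "<?xml version=\"1.0\"?>\n<words>\n"
                + "  <word>ampal</word>\n  <word>ample</word>\n</words>";
        System.out.println("multiple roots: " + isWellFormed(multiRoot));
        System.out.println("single root:    " + isWellFormed(singleRoot));
    }
}
```

The first input should be rejected and the second accepted, which matches the behaviour described above.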


灯角 2024-10-19 04:09:37

The page does not contain XML. It contains an XML snippet at best:

<?xml version="1.0"?> 
<word>ampal</word> 
<word>ample</word> 
<word>hampel</word> 
<word>hample</word> 
<word>lampl</word> 
<word>pampel</word> 
<word>sample</word>

This is not well-formed, since there is no single document element. SAX interprets the first <word> as the document element, and correctly reports "junk after document element" since, for all it knows, the document element ends on line 1.

To get around the error, do not treat this document as XML. Download it as text, remove the XML declaration (<?xml version="1.0"?>) and then wrap it in a fake document element before you try to process it.
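
The workaround above could be sketched roughly as follows in Java. This assumes the response body has already been downloaded into a String. The class name RhymeFragmentParser, the method parseWords, and the wrapper element name words are all made up for illustration. Treat it as one possible approach, not a definitive implementation.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper, not part of the original question or answers.
class RhymeFragmentParser {

    // Takes the raw response text and returns the contents of each <word> element.
    static List<String> parseWords(String raw) {
        // Drop the XML declaration; it is only legal at the very start of a document.
        String fragment = raw.replaceFirst("^\\s*<\\?xml[^>]*\\?>", "");
        // Wrap the fragment in a fake root so there is exactly one document element.
        String wrapped = "<words>" + fragment + "</words>";

        final List<String> words = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            private StringBuilder text; // non-null only while inside a <word>

            // Depending on parser configuration the name arrives in localName or qName.
            private boolean isWord(String localName, String qName) {
                return "word".equals(localName.isEmpty() ? qName : localName);
            }

            @Override
            public void startElement(String uri, String localName, String qName, Attributes atts) {
                if (isWord(localName, qName)) text = new StringBuilder();
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                if (text != null) text.append(ch, start, length);
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                if (isWord(localName, qName) && text != null) {
                    words.add(text.toString().trim());
                    text = null;
                }
            }
        };

        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(wrapped)), handler);
        } catch (SAXException | ParserConfigurationException | IOException e) {
            throw new IllegalArgumentException("Could not parse wrapped fragment", e);
        }
        return words;
    }

    public static void main(String[] args) {
        String raw = "<?xml version=\"1.0\"?> \n<word>ampal</word> \n<word>ample</word> \n<word>sample</word>";
        System.out.println(parseWords(raw));
    }
}
```

Stripping the declaration matters because an XML declaration that appears after the opening of the fake root would itself be a parse error.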
