Parsing invalid ampersands with Android's XmlPullParser
I am writing a little screen-scraping app that consumes some XHTML. It goes without saying that the XHTML is invalid: ampersands aren't escaped as &amp;.
I am using Android's XmlPullParser, and it throws the following error when it hits the incorrectly encoded value:
org.xmlpull.v1.XmlPullParserException: unterminated entity ref
(position:START_TAG <a href='/Fahrinfo/bin/query.bin/dox?ld=0.1&n=3&i=9c.0323581.1266265347&rt=0&vcra'>
@55:134 in java.io.InputStreamReader@43b1ef70)
How do I get around this? I have thought about the following solutions:
- Wrapping the InputStream in another one that replaces the ampersands with entity refs
- Configuring the parser so it magically accepts the incorrect markup
Which one is likely to be more successful?
Comments (2)
I was stuck on this for about an hour before figuring out that, in my case, it was the "&" that the XML pull parser couldn't resolve. The snippet that fixed it for me boiled down to this: I simply replace each & with &amp; before parsing, since I was dealing with parsing a URL. Hope this helps.
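The replacement described above could be sketched roughly as follows. This is not the original poster's snippet, only a minimal illustration under some assumptions: the class name AmpersandFixer and both method names are hypothetical, the whole document is assumed to be small enough to buffer in memory, and only ampersands that do not already begin a well-formed entity reference are escaped, so existing &amp; or &#38; references are not double-escaped.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.StringReader;
import java.nio.charset.Charset;
import java.util.regex.Pattern;

// Hypothetical helper; the class and method names are illustrative only.
final class AmpersandFixer {

    // Matches an '&' that does NOT start a well-formed entity reference
    // (a named reference like &amp;, or a numeric one like &#38; / &#x26;).
    private static final Pattern BARE_AMPERSAND =
            Pattern.compile("&(?!(?:[A-Za-z][A-Za-z0-9]*|#[0-9]+|#[xX][0-9A-Fa-f]+);)");

    // Replaces every bare '&' with "&amp;" and leaves existing references alone.
    static String escapeBareAmpersands(String markup) {
        return BARE_AMPERSAND.matcher(markup).replaceAll("&amp;");
    }

    // Reads the whole stream into memory, escapes bare ampersands, and
    // returns a Reader over the cleaned-up markup.
    static Reader sanitizedReader(InputStream in, Charset charset) throws IOException {
        StringBuilder sb = new StringBuilder();
        Reader reader = new InputStreamReader(in, charset);
        char[] buf = new char[4096];
        int n;
        while ((n = reader.read(buf)) != -1) {
            sb.append(buf, 0, n);
        }
        return new StringReader(escapeBareAmpersands(sb.toString()));
    }
}
```

The returned Reader can then be handed to the parser through XmlPullParser.setInput(Reader) in place of the raw stream. Note that named HTML entities such as &nbsp; are left untouched by this sketch, so the parser may still reject them if it does not know how to resolve them.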
I would go with your first option: replacing the ampersands seems a more fitting solution than the other. The second option is more of a hack that gets things working by accepting incorrect markup.