Parsing invalid ampersands with Android's XmlPullParser

Posted on 2024-08-21 10:23:35

I am writing a little screen-scraping app that consumes some XHTML. It goes without saying that the XHTML is invalid: ampersands aren't escaped as &amp;.

I am using Android's XmlPullParser, and it throws the following error when it reaches the incorrectly encoded value:

org.xmlpull.v1.XmlPullParserException: unterminated entity ref 
(position:START_TAG <a href='/Fahrinfo/bin/query.bin/dox?ld=0.1&n=3&i=9c.0323581.1266265347&rt=0&vcra'>
@55:134 in java.io.InputStreamReader@43b1ef70) 

How do I get around this? I have thought about the following solutions:

  1. Wrapping the InputStream in another one that replaces the ampersands with entity refs
  2. Configuring the Parser so it magically accepts the incorrect markup

Which one is likely to be more successful?

Comments (2)

幽蝶幻影 2024-08-28 10:23:35

I was stuck on this for about an hour before figuring out that, in my case, it was the "&" that the XML pull parser couldn't resolve. Here is a snippet of code that fixes it:

void ParsingActivity(String r) {
    try {
        parserCreator = XmlPullParserFactory.newInstance();
        parser = parserCreator.newPullParser();
        // Escape every raw ampersand before handing the text to the parser;
        // otherwise it fails with "unterminated entity ref".
        parser.setInput(new StringReader(r.replace("&", "&amp;")));
        // Unlike SAX, a pull parser fires no callbacks: we ask for each
        // event in turn, starting with the current one.
        int parserEvent = parser.getEventType();
        // Loop over all events until we reach the end of the document.
        while (parserEvent != XmlPullParser.END_DOCUMENT) {
            switch (parserEvent) {
            // We have reached the start of a tag.
            case XmlPullParser.START_TAG:
                // Get the name of the tag.
                String tag = parser.getName();
                // (The original snippet ends here; tag handling is omitted.)
                break;
            }
            // Advance to the next event.
            parserEvent = parser.next();
        }
    } catch (XmlPullParserException | IOException e) {
        e.printStackTrace();
    }
}

All I'm doing is replacing each & with &amp;, since I was parsing a URL.
Hope this helps.
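
One caveat with a blanket replacement: if the page mixes raw ampersands with ones that are already escaped, the escaped ones get escaped a second time. Below is a minimal sketch that only touches ampersands that do not already start one of the five predefined XML entities or a numeric character reference. The class name AmpersandEscaper, the method name escapeBareAmpersands, and the sample URL are made up for illustration; they are not part of the original answer.

```java
import java.util.regex.Pattern;

public class AmpersandEscaper {
    // Matches an '&' that is NOT the start of &amp; &lt; &gt; &quot; &apos;
    // or a numeric reference such as &#38; or &#x26;.
    private static final Pattern BARE_AMPERSAND = Pattern.compile(
            "&(?!(?:amp|lt|gt|quot|apos|#[0-9]+|#[xX][0-9A-Fa-f]+);)");

    // Returns the markup with every bare ampersand rewritten as &amp;.
    public static String escapeBareAmpersands(String markup) {
        return BARE_AMPERSAND.matcher(markup).replaceAll("&amp;");
    }

    public static void main(String[] args) {
        // Hypothetical input mixing raw and already-escaped ampersands.
        String tag = "<a href='/q?ld=0.1&n=3&amp;rt=0&vcra'>";
        // prints <a href='/q?ld=0.1&amp;n=3&amp;rt=0&amp;vcra'>
        System.out.println(escapeBareAmpersands(tag));
    }
}
```

The result can be passed to the parser the same way as before, for example parser.setInput(new StringReader(AmpersandEscaper.escapeBareAmpersands(r))). Note the trade-off: named HTML entities such as &nbsp; are treated as bare ampersands here, so they come out as literal text rather than raising an undefined-entity error.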

野鹿林 2024-08-28 10:23:35

I would go with your first option; replacing the ampersands seems a better fit than the other. The second option seems more like a hack that gets things working by accepting incorrect markup.
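
For reference, the first option can also be done without loading the whole page into a String. The following is only a sketch under assumptions: the class AmpersandEscapingReader is hypothetical, it recognises just the five predefined XML entities and numeric references, and it wraps a Reader (for example an InputStreamReader around the InputStream) rather than the InputStream itself.

```java
import java.io.FilterReader;
import java.io.IOException;
import java.io.PushbackReader;
import java.io.Reader;

/** Hypothetical wrapper that escapes bare '&' characters while being read. */
public class AmpersandEscapingReader extends FilterReader {
    // Longest lookahead after '&', enough for "#x10FFFF;".
    private static final int MAX_REF = 10;
    private final PushbackReader source;
    // Characters still owed to the caller ("amp;" after a bare '&').
    private final StringBuilder pending = new StringBuilder();
    private int pendingPos = 0;

    public AmpersandEscapingReader(Reader in) {
        super(new PushbackReader(in, MAX_REF));
        this.source = (PushbackReader) this.in;
    }

    @Override
    public int read() throws IOException {
        if (pendingPos < pending.length()) {
            return pending.charAt(pendingPos++);
        }
        int c = source.read();
        if (c == '&' && !startsReference()) {
            // Bare ampersand: emit '&' now and queue "amp;" behind it.
            pending.setLength(0);
            pending.append("amp;");
            pendingPos = 0;
        }
        return c;
    }

    @Override
    public int read(char[] buf, int off, int len) throws IOException {
        // Simple but slow: fills the buffer one character at a time.
        int n = 0;
        while (n < len) {
            int c = read();
            if (c == -1) break;
            buf[off + n++] = (char) c;
        }
        return (n == 0 && len > 0) ? -1 : n;
    }

    // Peeks at the characters after '&' and pushes them back unchanged.
    private boolean startsReference() throws IOException {
        char[] ahead = new char[MAX_REF];
        int n = 0;
        int c;
        while (n < MAX_REF && (c = source.read()) != -1) {
            ahead[n++] = (char) c;
            if (c == ';') break;
        }
        source.unread(ahead, 0, n);
        if (n < 2 || ahead[n - 1] != ';') return false;
        String name = new String(ahead, 0, n - 1);
        return name.matches("amp|lt|gt|quot|apos|#[0-9]+|#[xX][0-9A-Fa-f]+");
    }
}
```

It would be used as parser.setInput(new AmpersandEscapingReader(new InputStreamReader(stream, "UTF-8"))), where stream stands for whatever InputStream the page is read from. The sketch does not override skip(), mark(), or reset(), so it is only suitable for plain sequential reads.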
