Why does my InputStream keep downloading the file after I stop the SAX parser?

Posted on 2024-10-20 17:14:40

I'm working on an RSS feed parser in Android and I've implemented a SAX parser which works perfectly under most situations.

However, I've run into problems on a few of my test feeds. After I've parsed a specified number of feed items, I throw a SAXException to stop the parser, which AFAIK is the right way to do that. On most feeds, this stops the parsing and my catch block (see below) handles and logs the StopParsingException.

On SOME feeds, however, the parser stops parsing, but there is a long delay between the exception being thrown and my catch block being run, during which no parsing is done, but just enough time passes to download the entire file (which is what I suspect is happening).

Here's my setup and error handling code:

public boolean parse(){
        SAXParserFactory factory = SAXParserFactory.newInstance();
        try {
            SAXParser parser = factory.newSAXParser();
            URL u = new URL(mUrl);
            URLConnection UC = u.openConnection();
            UC.setConnectTimeout(CONNECT_TIMEOUT);
            UC.setReadTimeout(CONNECT_TIMEOUT);
            InputStreamReader r = new InputStreamReader(UC.getInputStream());
            parser.parse(new InputSource(r), this);     
        }catch(SAXException sax)
        {
            Exception ex = sax.getException();
            if(ex != null)
            {
                if(ex instanceof StopParsingException)
                {
                    //Feed was intentionally stopped (i.e. reached episode limit)
                    DebugLog.w(TAG, "Feed update stopped for: " + mUrl, ex);
                    return true;
                }else
                {
                    //Something went wrong, non-standard error
                    DebugLog.e(TAG, "Feed update failed for: " + mUrl, ex);
                    return false;
                }
            }else{
                //Something went wrong, non-standard error
                DebugLog.e(TAG, "Feed update failed fatally for: " + mUrl, sax);
                return false;
            }

        }
        catch(Exception e){
            DebugLog.e(TAG, "Unknown parse error on feed: "+mUrl, e);
            return false;
        }
        DebugLog.i(TAG, "Entire Feed Parsed successfully: "+mUrl);
        return true;
    }

When one of my conditions is met, I use this code:

throw (new SAXException(new StopParsingException("Max Items reached")));

for example to stop the parser.

My guess is that when I throw the exception, the SAXParser stops working, but the InputStreamReader continues to download the RSS feed from the server, since that is almost exactly the timing that my logs reveal.

Is there something wrong with my connection setup that makes only some servers not cooperate with me?

Alternatively, is there a way to directly and safely stop that InputStream before throwing my SAXException, so that I don't have this problem?

Comments (1)

回忆凄美了谁 2024-10-27 17:14:40

It is just possible that the SAX parser is trying to "recover" in some way and reading your input stream to (for example) close matching tags. If this was occurring, you could probably prevent it by closing the input stream before throwing your exception.
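A minimal sketch of the "close the stream before throwing" idea, assuming the handler is given a reference to the same InputStream that the parser reads from. The names EarlyStopSketch, ItemLimitHandler and parseAtMost are made up for illustration, and StopParsingException is a stand-in for the class of the same name in the question. It uses the standard JAXP parser on a plain InputStream rather than the InputStreamReader from the question, and whether closing the stream actually stops the download on a given Android HTTP stack is not something this sketch can confirm.

```java
import java.io.IOException;
import java.io.InputStream;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

final class EarlyStopSketch {

    /** Marker exception, standing in for the question's StopParsingException. */
    static final class StopParsingException extends Exception {
        StopParsingException(String message) {
            super(message);
        }
    }

    /** Counts item elements and aborts the parse once maxItems have been seen. */
    static final class ItemLimitHandler extends DefaultHandler {
        private final InputStream source;
        private final int maxItems;
        int itemsSeen = 0;

        ItemLimitHandler(InputStream source, int maxItems) {
            this.source = source;
            this.maxItems = maxItems;
        }

        @Override
        public void endElement(String uri, String localName, String qName)
                throws SAXException {
            if ("item".equals(qName) && ++itemsSeen >= maxItems) {
                try {
                    // Close the underlying stream first, so nothing is left
                    // for the parser to read while the exception unwinds.
                    source.close();
                } catch (IOException ignored) {
                    // Closing is best effort; the parse is aborted either way.
                }
                throw new SAXException(new StopParsingException("Max items reached"));
            }
        }
    }

    /** Parses the feed and returns how many items were seen before stopping. */
    static int parseAtMost(InputStream in, int maxItems) throws Exception {
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        ItemLimitHandler handler = new ItemLimitHandler(in, maxItems);
        try {
            parser.parse(new InputSource(in), handler);
        } catch (SAXException sax) {
            // An intentional stop is not an error; anything else is rethrown.
            if (!(sax.getException() instanceof StopParsingException)) {
                throw sax;
            }
        }
        return handler.itemsSeen;
    }
}
```

In the question's parse() method, the stream passed in would be the one returned by UC.getInputStream().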

Another alternative would be to register an error handler. In fact, the javadoc suggests that this could be the more correct approach.
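For reference, registering an error handler with the standard SAX API could look roughly like the sketch below. ErrorHandlerSketch, RecordingErrorHandler and countFatalErrors are hypothetical names, and the example assumes the JAXP parser that ships with the JDK rather than Android's. Note that an ErrorHandler only receives problems the parser itself reports; it does not close the connection on its own.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

final class ErrorHandlerSketch {

    /** Records fatal parse errors and rethrows them so that parsing stops. */
    static final class RecordingErrorHandler implements ErrorHandler {
        final List<String> fatalErrors = new ArrayList<>();

        @Override
        public void warning(SAXParseException e) {
            // Warnings are ignored in this sketch.
        }

        @Override
        public void error(SAXParseException e) {
            // Recoverable errors are ignored in this sketch.
        }

        @Override
        public void fatalError(SAXParseException e) throws SAXException {
            fatalErrors.add("line " + e.getLineNumber() + ": " + e.getMessage());
            throw e;
        }
    }

    /** Parses the given XML text and returns how many fatal errors were reported. */
    static int countFatalErrors(String xml) throws Exception {
        XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        RecordingErrorHandler errors = new RecordingErrorHandler();
        reader.setContentHandler(new DefaultHandler());
        reader.setErrorHandler(errors);
        try {
            reader.parse(new InputSource(new StringReader(xml)));
        } catch (SAXParseException reported) {
            // Already recorded by the error handler above.
        }
        return errors.fatalErrors.size();
    }
}
```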
