HttpURLConnections ignore timeouts and never return

Posted on 2024-07-15 10:07:11

We are randomly getting unexpected results from some servers when trying to open an InputStream from an HttpURLConnection. It seems that those servers accept the connection and reply with a "stay-alive" header, which keeps the socket open but never sends any data back on the stream.

That scenario makes a multi-threaded crawler a little "complicated": if a connection gets stuck, the thread running it never returns, so its pool never completes and the controller keeps thinking that some threads are still working.

Is there some way to read the connection's response headers to identify that "stay-alive" answer and avoid trying to open the stream?

Comments (2)

暖伴 2024-07-22 10:07:11

I'm not sure what I'm missing here but it seems to me you simply need getHeaderField()?
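A minimal sketch of what reading that header could look like. The class name, the `connectionHeader` and `demo` helpers, and the 2000 ms timeout are made up for illustration, and the tiny local server only stands in for a real one. Note that `getHeaderField()` itself has to wait for the server's response, so the sketch sets timeouts first; when no response arrives in time, the call returns null rather than throwing.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.URI;
import java.net.URL;
import java.nio.charset.StandardCharsets;

class ConnectionHeaderDemo {

    // Hypothetical helper: returns the Connection response header, or null
    // if the header is absent or no response arrived within the timeout.
    static String connectionHeader(URL url, int timeoutMillis) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        conn.setConnectTimeout(timeoutMillis);
        conn.setReadTimeout(timeoutMillis);
        try {
            // Triggers the request and blocks until headers arrive or a timeout fires.
            return conn.getHeaderField("Connection");
        } finally {
            conn.disconnect();
        }
    }

    // Demo only: a one-shot local server that answers with "Connection: keep-alive".
    static String demo() throws Exception {
        try (ServerSocket server = new ServerSocket(0, 50, InetAddress.getByName("127.0.0.1"))) {
            Thread responder = new Thread(() -> {
                try (Socket s = server.accept()) {
                    BufferedReader in = new BufferedReader(
                            new InputStreamReader(s.getInputStream(), StandardCharsets.US_ASCII));
                    String line;
                    while ((line = in.readLine()) != null && !line.isEmpty()) {
                        // skip the request line and request headers
                    }
                    OutputStream out = s.getOutputStream();
                    out.write("HTTP/1.1 200 OK\r\nConnection: keep-alive\r\nContent-Length: 0\r\n\r\n"
                            .getBytes(StandardCharsets.US_ASCII));
                    out.flush();
                } catch (IOException ignored) {
                    // demo server: nothing useful to do on failure
                }
            });
            responder.setDaemon(true);
            responder.start();
            URL url = URI.create("http://127.0.0.1:" + server.getLocalPort() + "/").toURL();
            return connectionHeader(url, 2000);
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(demo());
    }
}
```

Keep in mind that a keep-alive header is a normal part of HTTP persistent connections, so seeing it does not by itself indicate that the body will stall.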

允世 2024-07-22 10:07:11

Did you try setting a read timeout in addition to the connect timeout?

See http://java.sun.com/j2se/1.5.0/docs/api/java/net/URLConnection.html#setReadTimeout%28int%29
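A rough sketch of that suggestion, under some assumptions: the class name, the `timesOut` and `demo` helpers, and the timeout values are invented for illustration, and a local `ServerSocket` that never calls `accept()` stands in for a server that takes the connection but sends nothing back. With `setReadTimeout` set, the blocked read should end in a `SocketTimeoutException` instead of hanging the crawler thread.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.SocketTimeoutException;
import java.net.URI;
import java.net.URL;

class ReadTimeoutDemo {

    // Hypothetical helper: returns true if the request was abandoned
    // because the connect or read timeout fired.
    static boolean timesOut(URL url, int connectMillis, int readMillis) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        conn.setConnectTimeout(connectMillis); // limit on establishing the connection
        conn.setReadTimeout(readMillis);       // limit on each blocking read, headers included
        try (InputStream in = conn.getInputStream()) {
            while (in.read() != -1) {
                // drain the body; a real crawler would process it here
            }
            return false;
        } catch (SocketTimeoutException e) {
            return true;
        } finally {
            conn.disconnect();
        }
    }

    // Demo only: the OS completes the TCP handshake from the listen backlog,
    // but since accept() is never called, no response is ever written.
    static boolean demo() throws IOException {
        try (ServerSocket silent = new ServerSocket(0, 50, InetAddress.getByName("127.0.0.1"))) {
            URL url = URI.create("http://127.0.0.1:" + silent.getLocalPort() + "/").toURL();
            return timesOut(url, 2000, 500);
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("timed out: " + demo());
    }
}
```

The read timeout applies to each individual blocking read, so a server that trickles a byte now and then can still hold a thread for a long time; a crawler may want an overall deadline on top of this.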
