In Java, a thread hangs in SocketRead0. What can I do?
I'm developing a web crawler, but often after running for a short time (a few minutes), some threads stop doing their work. Running a debugger, I found that they are stuck in SocketRead0.
This happens when a thread downloads the content of a page with HttpURLConnection.getInputStream().
I don't know what causes this, but I think it is related to multithreading.
Does anyone know how to solve or avoid this?
I'm not using a pool of HttpURLConnection objects yet because I don't know how to set one up.
    conn = (HttpURLConnection) new URL(url).openConnection();
    conn.setInstanceFollowRedirects(true);
    conn.connect();
    CountingInputStream content;
    try {
        content = new CountingInputStream(conn.getInputStream());
        // processing of content
        content.close();
        return true;
    } catch (Exception e) {
        return false;
    }
Comments (2)
You need to set a socket read timeout on the connection. This will cause it to throw an exception after the specified time period instead of hanging.
http://download.oracle.com/javase/1.5.0/docs/api/java/net/URLConnection.html#setReadTimeout(int)
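As a rough sketch of that advice, the snippet below sets both a connect timeout and a read timeout before reading the response. The class name `TimeoutFetch`, the helper `fetch`, and the timeout values are illustrative assumptions, not part of the original crawler, and the byte counting is done by hand instead of with `CountingInputStream` so that the example needs only the JDK. The `main` method points it at a local socket that accepts connections but never answers, which is roughly what an unresponsive server looks like to the crawler thread.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.ServerSocket;
import java.net.URL;

class TimeoutFetch {

    /**
     * Downloads a page and returns the number of body bytes read,
     * or -1 if the request failed or timed out.
     * (Hypothetical helper; the name and return convention are assumptions.)
     */
    static long fetch(String url, int connectTimeoutMs, int readTimeoutMs) {
        HttpURLConnection conn = null;
        try {
            conn = (HttpURLConnection) new URL(url).openConnection();
            conn.setInstanceFollowRedirects(true);
            // Give up if the TCP connection cannot be established in time.
            conn.setConnectTimeout(connectTimeoutMs);
            // Give up if a single read blocks longer than this.
            conn.setReadTimeout(readTimeoutMs);

            long total = 0;
            byte[] buf = new byte[8192];
            // try-with-resources closes the stream even when a read fails.
            try (InputStream in = conn.getInputStream()) {
                int n;
                while ((n = in.read(buf)) != -1) {
                    total += n; // processing of content would go here
                }
            }
            return total;
        } catch (IOException e) {
            // SocketTimeoutException is a subclass of IOException,
            // so a timed-out read ends up here instead of hanging.
            return -1;
        } finally {
            if (conn != null) {
                conn.disconnect();
            }
        }
    }

    public static void main(String[] args) throws IOException {
        // A local socket that accepts connections but never sends a response.
        try (ServerSocket silent = new ServerSocket(0)) {
            String url = "http://127.0.0.1:" + silent.getLocalPort() + "/";
            System.out.println(fetch(url, 1000, 500)); // prints -1
        }
    }
}
```

Both timeouts default to 0, which means wait indefinitely, so a crawler that never sets them can block forever on a single slow host. Suitable values depend on the sites being crawled.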
The server you're using is probably not sending data when you expect it to, and your thread is blocked waiting for data.
The original java.io.* classes you are using are a blocking I/O implementation, which means that methods like InputStream.read() will halt the thread if no data is available to read: the call waits until there is data, and returns once it arrives. In Java 1.4, the java.nio package was added, which provides a non-blocking I/O implementation. I recommend you use that if you're talking to a server that may not respond reliably.
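To illustrate the difference at the socket level, here is a minimal sketch of a bounded wait using java.nio: the channel is put into non-blocking mode and a `Selector` waits for readable data for a limited time. The class name `NioTimedRead` and the helper `readWithTimeout` are made up for this example. It works on a raw `SocketChannel`, so it does not speak HTTP by itself; it only shows how a read can avoid blocking forever.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.nio.ByteBuffer;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.SocketChannel;

class NioTimedRead {

    /**
     * Waits up to timeoutMs (must be greater than 0) for data on a
     * non-blocking channel. Returns the number of bytes read, 0 if nothing
     * arrived in time, or -1 if the peer closed the connection.
     * (Hypothetical helper written for this example.)
     */
    static int readWithTimeout(SocketChannel channel, ByteBuffer buf, long timeoutMs)
            throws IOException {
        // The channel must already be in non-blocking mode, otherwise
        // register() throws IllegalBlockingModeException.
        try (Selector selector = Selector.open()) {
            channel.register(selector, SelectionKey.OP_READ);
            if (selector.select(timeoutMs) == 0) {
                return 0; // no data became readable within the timeout
            }
            return channel.read(buf);
        }
    }

    public static void main(String[] args) throws IOException {
        // A local socket that accepts connections but never sends anything.
        try (ServerSocket silent = new ServerSocket(0);
             SocketChannel channel = SocketChannel.open(
                     new InetSocketAddress("127.0.0.1", silent.getLocalPort()))) {
            channel.configureBlocking(false);
            int n = readWithTimeout(channel, ByteBuffer.allocate(1024), 300);
            System.out.println(n); // prints 0: the wait ended without data
        }
    }
}
```

Opening a new selector per call keeps the sketch short. A crawler handling many connections would more likely register all of its channels with one long-lived selector.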