BindException / Too many open files when using HttpClient under load
I have 1000 dedicated Java threads, each of which polls its corresponding URL once per second.
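public class Poller {
    public static Node poll(Node node) {
        GetMethod method = null;
        try {
            // A new client and connection manager are created on every poll.
            HttpClient client = new HttpClient(new SimpleHttpConnectionManager(true));
            ......
        } catch (IOException ex) {
            ex.printStackTrace();
        } finally {
            // Guard against an exception thrown before the method was assigned.
            if (method != null) {
                method.releaseConnection();
            }
        }
    }
}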
The threads are started once per second:
for (int i = 0; i < 1000; i++) {
    MyThread thread = threads.get(i); // threads is a static field
    if (thread.isAlive()) {
        // If the previous thread is still running, let it run.
    } else {
        thread.start();
    }
}
The problem is that if I run the job every second, I get random exceptions like these:
java.net.BindException: Address already in use
INFO httpclient.HttpMethodDirector: I/O exception (java.net.BindException) caught when processing request: Address already in use
INFO httpclient.HttpMethodDirector: Retrying request
But if I run the job every 2 seconds or more, everything runs fine.
I even tried shutting down the SimpleHttpConnectionManager instance using shutdown(), with no effect.
If I run netstat, I see thousands of TCP connections in the TIME_WAIT state, which means they have been closed and are being cleaned up.
So, to limit the number of connections, I tried using a single instance of HttpClient, like this:
public class MyHttpClientFactory {
    // The class name is used consistently here; the original mixed
    // MyHttpClientFactory and HttpClientFactory, which does not compile.
    private static MyHttpClientFactory instance = new MyHttpClientFactory();
    private MultiThreadedHttpConnectionManager connectionManager;
    private HttpClient client;

    private MyHttpClientFactory() {
        init();
    }

    public static MyHttpClientFactory getInstance() {
        return instance;
    }

    public void init() {
        connectionManager = new MultiThreadedHttpConnectionManager();
        HttpConnectionManagerParams managerParams = new HttpConnectionManagerParams();
        managerParams.setMaxTotalConnections(1000);
        connectionManager.setParams(managerParams);
        client = new HttpClient(connectionManager);
    }

    public HttpClient getHttpClient() {
        if (client != null) {
            return client;
        } else {
            init();
            return client;
        }
    }
}
However, after running for exactly 2 hours, it starts throwing 'Too many open files' and eventually cannot do anything at all.
ERROR java.net.SocketException: Too many open files
INFO httpclient.HttpMethodDirector: I/O exception (java.net.SocketException) caught when processing request: Too many open files
INFO httpclient.HttpMethodDirector: Retrying request
I should be able to increase the number of connections allowed and make it work, but I would just be postponing the problem. Any idea what the best practice is for using HttpClient in a situation like this?
By the way, I am still on HttpClient 3.1.
Comments (3)
As root (or with sudo), edit the /etc/security/limits.conf file. At the end of the file, just above "# End of File", add the following lines:
* soft nofile 65535
* hard nofile 65535
This raises the limit on open files to 65535; it does not make it unlimited. The new limit typically takes effect at the next login session.
This happened to us a few months back. First, double check to make sure you really are calling releaseConnection() every time. But even then, the OS doesn't actually reclaim the TCP connections all at once. The solution is to use the Apache HTTP Client's MultiThreadedHttpConnectionManager. This pools and reuses the connections.
See http://hc.apache.org/httpclient-3.x/performance.html for more performance tips.
Update: Whoops, I didn't read the lower code sample. If you're doing releaseConnection() and using MultiThreadedHttpConnectionManager, consider whether your OS limit on open files per process is set high enough. We had that problem too, and needed to extend the limit a bit.
There is nothing wrong with the first error. You have simply depleted the available ephemeral ports. Each TCP connection can stay in the TIME_WAIT state for 2 minutes, and you are generating 2000 per second. Sooner or later, a socket can't find an unused local port and you get that error. TIME_WAIT is designed for exactly this purpose: without it, your system might hijack a previous connection.
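The arithmetic behind that explanation can be sketched in a few lines of Java. This is only a rough model, not a measurement. The rate of 1000 connections per second is an assumption taken from the question's description (1000 threads, one poll each per second), the 120-second TIME_WAIT comes from the answer above (the real value depends on the OS), and the port count of 28232 assumes a Linux default ephemeral range of 32768 to 60999, which may differ on your system. The class and method names are made up for illustration.

```java
public class TimeWaitEstimate {

    /** Sockets sitting in TIME_WAIT at steady state: arrival rate times linger time. */
    public static long socketsInTimeWait(long connectionsPerSecond, long timeWaitSeconds) {
        return connectionsPerSecond * timeWaitSeconds;
    }

    /** Seconds until the port range is used up if no port is released in the meantime. */
    public static double secondsUntilExhaustion(long ephemeralPorts, long connectionsPerSecond) {
        return (double) ephemeralPorts / connectionsPerSecond;
    }

    public static void main(String[] args) {
        long rate = 1000;            // assumption: one new connection per thread per second
        long timeWaitSeconds = 120;  // from the answer; OS-dependent in practice
        long ephemeralPorts = 28232; // assumption: Linux default range 32768-60999

        System.out.println("Sockets in TIME_WAIT at steady state: "
                + socketsInTimeWait(rate, timeWaitSeconds));
        System.out.println("Seconds until the port range is used up: "
                + secondsUntilExhaustion(ephemeralPorts, rate));
    }
}
```

Under these assumptions the steady-state demand is several times larger than the port range, so opening a fresh connection for every poll cannot be sustained regardless of how quickly each one is closed. Reusing connections lowers the rate term, which is the only input the application controls.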
The second error means you have too many sockets open. On some systems, there is a limit of 1K open files per process. Maybe you just hit that limit because of lingering sockets and other open files. On Linux, you can change this limit using ulimit -n, but that is limited by a system-wide maximum value.
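Before raising the limit, it can help to check whether the descriptor count actually keeps growing, since an error that appears only after two hours may point to a slow leak. The following is a minimal sketch, assuming a HotSpot/OpenJDK JVM on a Unix-like system where the platform bean implements com.sun.management.UnixOperatingSystemMXBean. The class name FdUsage and its method are hypothetical helpers, not part of HttpClient.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;

public class FdUsage {

    /**
     * Returns {open, max} file descriptor counts for this JVM process,
     * or null when the platform bean does not expose them (for example on Windows).
     */
    public static long[] descriptorCounts() {
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        if (os instanceof com.sun.management.UnixOperatingSystemMXBean) {
            com.sun.management.UnixOperatingSystemMXBean unix =
                    (com.sun.management.UnixOperatingSystemMXBean) os;
            return new long[] {
                unix.getOpenFileDescriptorCount(),
                unix.getMaxFileDescriptorCount()
            };
        }
        return null;
    }

    public static void main(String[] args) {
        long[] counts = descriptorCounts();
        if (counts == null) {
            System.out.println("File descriptor counts are not available on this JVM.");
        } else {
            System.out.println("Open file descriptors: " + counts[0] + " of " + counts[1]);
        }
    }
}
```

Logging this value periodically from the polling application would show whether the open count levels off (the limit is merely too low) or climbs steadily (something is not being released).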