Highest performance for Apache HttpClient HEAD requests?
I'm using Apache HttpClient 4.x to make HEAD requests to URIs, only to get the final post-302 URL location of each link. E.g. http://bit.ly/test1231 really points to cnn.com or something similar. What would be the best and most efficient way, using HttpClient, to achieve this in a server that could run for months without leaking? Right now I'm running into the issue that every x minutes all the threads freeze while trying to pull a connection out of the pool, and they all time out.
I'm planning on having 100 worker threads doing the fetching, so I was using the threaded connection manager.
UPDATE: Here is the code I'm using to build the HttpClient object:
HttpParams httpParams = new BasicHttpParams();

// Socket/connection timeouts and overall pool size
HttpConnectionParams.setConnectionTimeout(httpParams, 5000);
HttpConnectionParams.setSoTimeout(httpParams, 5000);
ConnManagerParams.setMaxTotalConnections(httpParams, 5000);
HttpProtocolParams.setVersion(httpParams, HttpVersion.HTTP_1_1);

// Allow up to 35 concurrent connections per route
ConnManagerParams.setMaxConnectionsPerRoute(httpParams, new ConnPerRoute() {
    @Override
    public int getMaxForRoute(HttpRoute route) {
        return 35;
    }
});

// Cookie store that never retains anything
emptyCookieStore = new CookieStore() {
    private final List<Cookie> emptyList = new ArrayList<Cookie>();

    @Override
    public void addCookie(Cookie cookie) {
    }

    @Override
    public List<Cookie> getCookies() {
        return emptyList;
    }

    @Override
    public boolean clearExpired(Date date) {
        return false;
    }

    @Override
    public void clear() {
    }
};

// set request params
httpParams.setParameter("http.protocol.cookie-policy", CookiePolicy.BROWSER_COMPATIBILITY);
httpParams.setParameter("http.useragent", "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1)");
httpParams.setParameter("http.language.Accept-Language", "en-us");
httpParams.setParameter("http.protocol.content-charset", "UTF-8");
httpParams.setParameter("Accept", "application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5");
httpParams.setParameter("Cache-Control", "max-age=0");

SchemeRegistry schemeRegistry = new SchemeRegistry();
schemeRegistry.register(new Scheme("http", PlainSocketFactory.getSocketFactory(), 80));
// https must use an SSL socket factory; a plain socket factory cannot do TLS
schemeRegistry.register(new Scheme("https", SSLSocketFactory.getSocketFactory(), 443));

final ClientConnectionManager cm = new ThreadSafeClientConnManager(httpParams, schemeRegistry);
DefaultHttpClient httpClient = new DefaultHttpClient(cm, httpParams);

// How long a thread may block waiting for a free pooled connection
httpClient.getParams().setParameter("http.conn-manager.timeout", 120000L);
// wait-for-continue is read back as an int, so pass an Integer rather than a Long
httpClient.getParams().setParameter("http.protocol.wait-for-continue", 10000);
httpClient.getParams().setParameter("http.tcp.nodelay", true);
Comments (1)
Most likely you have too many worker threads contending for too few connections. Please make sure the maximum-connections-per-route limit is set to a reasonable value (by default the limit is two concurrent connections per route, as required by the HTTP specification).
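As a rough illustration of that point, the sketch below shows one way the per-route and total limits could be sized for roughly 100 worker threads that mostly hit a handful of URL-shortener hosts. The classes (ConnManagerParams, ConnPerRouteBean, HttpRoute) are part of HttpClient 4.0, but the specific numbers and the bit.ly route are assumptions for the example, not values taken from the answer.

import org.apache.http.HttpHost;
import org.apache.http.conn.params.ConnManagerParams;
import org.apache.http.conn.params.ConnPerRouteBean;
import org.apache.http.conn.routing.HttpRoute;
import org.apache.http.params.HttpParams;

// Hypothetical tuning helper; the figures are illustrative only.
static void tunePoolLimits(HttpParams params) {
    // Give the pool at least as many connections as there are worker threads.
    ConnManagerParams.setMaxTotalConnections(params, 200);

    // Raise the per-route ceiling well above the default of 2; a shortener
    // service concentrates nearly all traffic on one or two routes.
    ConnPerRouteBean connPerRoute = new ConnPerRouteBean(20);
    connPerRoute.setMaxForRoute(new HttpRoute(new HttpHost("bit.ly", 80)), 100);
    ConnManagerParams.setMaxConnectionsPerRoute(params, connPerRoute);

    // Fail fast instead of blocking for minutes when the pool is exhausted.
    ConnManagerParams.setTimeout(params, 10000L);
}

If the per-route limit is left at the default while 100 threads all target the same host, 98 of them will queue on the connection manager until the "http.conn-manager.timeout" expires, which matches the freeze-and-timeout pattern described in the question.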