Apache HttpClient: highest performance for HEAD requests?

Posted 2024-09-17 21:53:06

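I'm using Apache HttpClient 4.x to make HEAD requests to URIs, solely to get the final URL that each link resolves to after its 302 redirects. For example, http://bit.ly/test1231 really points to cnn.com or something similar. What is the best and most efficient way to do this with HttpClient in a server that may run for months without leaking? Right now, every x minutes all the threads freeze while trying to pull a connection out of the pool, and they all time out.

I plan to have 100 worker threads doing the fetching, so I'm using the thread-safe connection manager.

UPDATE: Here is the code I'm using to get an httpClient object: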

HttpParams httpParams = new BasicHttpParams();

// Connect and socket read timeouts: 5 seconds each
HttpConnectionParams.setConnectionTimeout(httpParams, 5000);
HttpConnectionParams.setSoTimeout(httpParams, 5000);

// Pool limits: 5000 connections in total, 35 per route
ConnManagerParams.setMaxTotalConnections(httpParams, 5000);
HttpProtocolParams.setVersion(httpParams, HttpVersion.HTTP_1_1);
ConnManagerParams.setMaxConnectionsPerRoute(httpParams, new ConnPerRoute() {
    @Override
    public int getMaxForRoute(HttpRoute route) {
        return 35;
    }
});

// Cookie store that never keeps anything
emptyCookieStore = new CookieStore() {
    private final List<Cookie> emptyList = new ArrayList<Cookie>();

    @Override
    public void addCookie(Cookie cookie) {
    }

    @Override
    public List<Cookie> getCookies() {
        return emptyList;
    }

    @Override
    public boolean clearExpired(Date date) {
        return false;
    }

    @Override
    public void clear() {
    }
};

// set request params
httpParams.setParameter("http.protocol.cookie-policy", CookiePolicy.BROWSER_COMPATIBILITY);
httpParams.setParameter("http.useragent", "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1)");
httpParams.setParameter("http.language.Accept-Language", "en-us");
httpParams.setParameter("http.protocol.content-charset", "UTF-8");
httpParams.setParameter("Accept", "application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5");
httpParams.setParameter("Cache-Control", "max-age=0");

SchemeRegistry schemeRegistry = new SchemeRegistry();
schemeRegistry.register(new Scheme("http", PlainSocketFactory.getSocketFactory(), 80));
// HTTPS needs an SSL socket factory; a plain socket factory cannot perform the TLS handshake
schemeRegistry.register(new Scheme("https", SSLSocketFactory.getSocketFactory(), 443));

final ClientConnectionManager cm = new ThreadSafeClientConnManager(httpParams, schemeRegistry);

DefaultHttpClient httpClient = new DefaultHttpClient(cm, httpParams);
// Wait up to 120 seconds for a connection from the pool
httpClient.getParams().setParameter("http.conn-manager.timeout", 120000L);
httpClient.getParams().setParameter("http.protocol.wait-for-continue", 10000L);
httpClient.getParams().setParameter("http.tcp.nodelay", true);
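
For reference, the goal described in the question (issue a HEAD request, read the redirect target, repeat until there is no further redirect) can be sketched as follows. This is only an illustration under stated assumptions: it uses the JDK's `HttpURLConnection` rather than Apache HttpClient so that it has no dependencies, and the class name `RedirectResolver`, the methods `nextHop` and `resolve`, and the `maxHops` and `timeoutMillis` parameters are hypothetical names, not part of the original code. It assumes every hop is an absolute http or https URL.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URI;

final class RedirectResolver {

    /** Returns the next URI to visit, or null when the response is not a redirect. */
    static URI nextHop(URI current, int status, String location) {
        boolean redirect = status == 301 || status == 302 || status == 303
                || status == 307 || status == 308;
        if (!redirect || location == null || location.isEmpty()) {
            return null;
        }
        // Location may be relative, so resolve it against the URI just requested.
        return current.resolve(location);
    }

    /** Follows redirects with HEAD requests for at most maxHops hops; returns the last URI reached. */
    static URI resolve(URI start, int maxHops, int timeoutMillis) throws IOException {
        URI current = start;
        for (int hop = 0; hop < maxHops; hop++) {
            HttpURLConnection conn = (HttpURLConnection) current.toURL().openConnection();
            try {
                conn.setRequestMethod("HEAD");
                conn.setInstanceFollowRedirects(false); // inspect each hop ourselves
                conn.setConnectTimeout(timeoutMillis);
                conn.setReadTimeout(timeoutMillis);
                URI next = nextHop(current, conn.getResponseCode(), conn.getHeaderField("Location"));
                if (next == null) {
                    return current; // not a redirect: this is the final location
                }
                current = next;
            } finally {
                conn.disconnect(); // always give the connection back, even on failure
            }
        }
        return current; // hop limit reached; return the last URI seen
    }
}
```

The `finally` block is the part that matters for a long-running server: whichever client library is used, a connection that is not released on every code path, including failures, is one fewer connection available to the other workers.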



Comments (1)

世俗缘 2024-09-24 21:53:06

Most likely you have too many worker threads contending for very few connections. Make sure the maximum number of connections per route is set to a reasonable value (by default the limit is two concurrent connections, as the HTTP specification, RFC 2616, recommends).
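
To make the contention argument concrete, here is a rough back-of-the-envelope model rather than a measurement. It assumes the worst case: every worker targets the same route and every request holds its connection for the full hold time. The class `RoutePoolModel` and the method `worstCaseWaitMillis` are hypothetical names introduced only for this illustration.

```java
final class RoutePoolModel {

    /**
     * Worst-case time in milliseconds that the last of `workers` threads waits for a
     * connection when one route allows `maxPerRoute` concurrent connections and each
     * request holds its connection for `holdMillis`.
     */
    static long worstCaseWaitMillis(int workers, int maxPerRoute, long holdMillis) {
        if (workers <= 0 || maxPerRoute <= 0 || holdMillis < 0) {
            throw new IllegalArgumentException("workers and maxPerRoute must be positive, holdMillis non-negative");
        }
        // Workers are served in rounds of maxPerRoute; the last round waits for all earlier ones.
        long rounds = (workers + maxPerRoute - 1L) / maxPerRoute; // ceiling division
        return (rounds - 1) * holdMillis;
    }

    public static void main(String[] args) {
        // Figures from the question: 100 workers, 5000 ms socket timeout as the hold time.
        System.out.println(worstCaseWaitMillis(100, 2, 5000));  // default per-route limit
        System.out.println(worstCaseWaitMillis(100, 35, 5000)); // limit configured in the question
    }
}
```

Under these assumptions, 100 workers sharing the default limit of two connections could leave the last worker waiting 245000 ms, which is longer than the 120000 ms `http.conn-manager.timeout` in the question. With a limit of 35 the same model gives 10000 ms. Real traffic spread over many hosts will behave better than this, but the model shows why a low per-route limit combined with slow responses can make every thread appear to freeze at once.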
