Multithreading issue with HttpClient

Posted on 2024-11-01 19:45:30

I'm running into a multithreading issue with HttpClient where I have the following scenario:

Thread A will issue url http://blap.com?param=2

Thread B will issue url http://blap.com?param=3

This works about 98% of the time, but occasionally Thread A receives the data for Thread B's URL and vice versa.

Each thread creates its own HttpClient instance, so I thought that in theory I wouldn't need to use MultiThreadedHttpConnectionManager.

Does the behavior I'm describing seem plausible, and would using MultiThreadedHttpConnectionManager fix it?

I'm using Java 1.6 and Apache HttpClient components 4.0.3.

Update:
Here's the function in question.

public void get_url(String strDataSet) throws SQLException, MalformedURLException, IOException
{
    String query = "select * from jobs where data_set='" + strDataSet + "'";

    ResultSet rs2 = dbf.db_run_query(query);
    rs2.next();

    HttpClient httpclient = new DefaultHttpClient();
    HttpResponse response;

    String strURL;
    strURL = rs2.getString("url_static");

    if (rs2.getString("url_dynamic") != null && !rs2.getString("url_dynamic").isEmpty())
        strURL = strURL.replace("${dynamic}", rs2.getString("url_dynamic"));

    UtilityFunctions.stdoutwriter.writeln("Retrieving URL: " + strURL, Logs.STATUS2, "DG25");

    if (!strURL.contains(":"))
        UtilityFunctions.stdoutwriter.writeln("WARNING: url is not preceeded with a protocol" + strURL, Logs.STATUS1, "DG25.5");

    // HttpGet chokes on the ^ character
    strURL = strURL.replace("^", "%5E");

    HttpGet httpget = new HttpGet(strURL);

    /*
     * The following line fixes an issue where a non-fatal error is displayed about an invalid cookie data format.
     * It turns out that some sites generate a warning with this code, and others without it.
     * I'm going to kludge this for now until I get more data on which urls throw the
     * warning and which don't.
     *
     * warning with code: www.exchange-rates.org
     */
    if (!(strCurDataSet.contains("xrateorg") || strCurDataSet.contains("google") || strCurDataSet.contains("mwatch")))
    {
        httpget.getParams().setParameter("http.protocol.cookie-datepatterns",
                Arrays.asList("EEE, dd MMM-yyyy-HH:mm:ss z", "EEE, dd MMM yyyy HH:mm:ss z"));
    }

    response = httpclient.execute(httpget);

    HttpEntity entity = response.getEntity();

    BufferedReader in = new BufferedReader(
            new InputStreamReader(
                    entity.getContent()));

    int nTmp;

    returned_content = "";

    while ((nTmp = in.read()) != -1)
        returned_content = returned_content + (char) nTmp;

    in.close();

    httpclient.getConnectionManager().shutdown();

    UtilityFunctions.stdoutwriter.writeln("Done reading url contents", Logs.STATUS2, "DG26");
}

Update:
I narrowed the problem down to the line:

response = httpclient.execute(httpget);

If I put a lock around that line, the problem goes away. However, that is the most time-consuming part, and I don't want only one thread at a time to be able to retrieve HTTP data.


Comments (2)

紫﹏色ふ单纯 2024-11-08 19:45:30

Your code's not thread-safe. To fix your immediate problem, you need to declare your HttpClient as a ThreadLocal, but there's a lot more to fix.
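
As a rough illustration of the ThreadLocal suggestion, here is a minimal sketch that uses only the JDK. StubClient is a hypothetical stand-in for HttpClient and does no network I/O; PerThreadClientDemo, fetch and fetchBoth are likewise made-up names. The sketch also returns the body to the caller instead of writing it to a field: in the question, returned_content, strCurDataSet and dbf are used without a local declaration, so they must be fields, and if the threads share the enclosing object those fields are one possible place for results to get mixed up.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class PerThreadClientDemo {

    /** Hypothetical stand-in for an HTTP client; it performs no network I/O. */
    static final class StubClient {
        String get(String url) {
            return "body of " + url;
        }
    }

    // One StubClient per thread, created lazily on that thread's first get().
    static final ThreadLocal<StubClient> CLIENT = new ThreadLocal<StubClient>() {
        @Override
        protected StubClient initialValue() {
            return new StubClient();
        }
    };

    /** Returns the body to the caller instead of storing it in a shared field. */
    static String fetch(String url) {
        StubClient client = CLIENT.get();          // this thread's own instance
        StringBuilder body = new StringBuilder();  // local, so not visible to other threads
        body.append(client.get(url));
        return body.toString();
    }

    /** Fetches both URLs on separate threads and returns the results in argument order. */
    static String[] fetchBoth(final String urlA, final String urlB) {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            Future<String> a = pool.submit(new Callable<String>() {
                @Override
                public String call() {
                    return fetch(urlA);
                }
            });
            Future<String> b = pool.submit(new Callable<String>() {
                @Override
                public String call() {
                    return fetch(urlB);
                }
            });
            return new String[] { a.get(), b.get() };
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        String[] results = fetchBoth("http://blap.com?param=2", "http://blap.com?param=3");
        System.out.println(results[0]);
        System.out.println(results[1]);
    }
}
```

Because each call keeps its working state in locals and hands the result back through the return value, no lock is needed around the request itself, so the slow part can still run in parallel.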

乖不如嘢 2024-11-08 19:45:30

You need to create a new HttpContext in each thread and pass it to HttpClient.execute:

HttpContext localContext = new BasicHttpContext();
response = httpclient.execute(httpget, localContext);

See the bottom of this doc (from HttpClient 4):

http://hc.apache.org/httpcomponents-client-ga/tutorial/html/statemgmt.html

There is also a thread-safe HttpContext implementation (SyncBasicHttpContext), but I'm not sure whether you would need it in this case.
