Segmentation fault in libcurl, multithreaded

Posted 2024-09-10 18:50:20

I have a bunch of worker threads, each running a simple curl class with its own curl easy handle. They only do HEAD lookups on random websites. Locking functions are also in place to enable multi-threaded SSL, as documented here. Everything works except for two sites, ilsole24ore.com (seen in the example below) and ninemsn.com.au/, which sometimes produce a segfault, as shown in this trace output:

 #0  *__GI___libc_res_nquery (statp=0xb4d12df4, name=0x849e9bd "ilsole24ore.com", class=1, type=1, answer=0xb4d0ca10 "", anslen=1024, answerp=0xb4d0d234,
        answerp2=0x0, nanswerp2=0x0, resplen2=0x0) at res_query.c:182
    #1  0x00434e8b in __libc_res_nquerydomain (statp=0xb4d12df4, name=0xb4d0ca10 "", domain=0x0, class=1, type=1, answer=0xb4d0ca10 "", anslen=1024,
        answerp=0xb4d0d234, answerp2=0x0, nanswerp2=0x0, resplen2=0x0) at res_query.c:576
    #2  0x004352b5 in *__GI___libc_res_nsearch (statp=0xb4d12df4, name=0x849e9bd "ilsole24ore.com", class=1, type=1, answer=0xb4d0ca10 "", anslen=1024,
        answerp=0xb4d0d234, answerp2=0x0, nanswerp2=0x0, resplen2=0x0) at res_query.c:377
    #3  0x009c0bd6 in *__GI__nss_dns_gethostbyname3_r (name=0x849e9bd "ilsole24ore.com", af=2, result=0xb4d0d5fc, buffer=0xb4d0d300 "\177", buflen=512,
        errnop=0xb4d12b30, h_errnop=0xb4d0d614, ttlp=0x0, canonp=0x0) at nss_dns/dns-host.c:197
    #4  0x009c0f2b in _nss_dns_gethostbyname2_r (name=0x849e9bd "ilsole24ore.com", af=2, result=0xb4d0d5fc, buffer=0xb4d0d300 "\177", buflen=512,
        errnop=0xb4d12b30, h_errnop=0xb4d0d614) at nss_dns/dns-host.c:251
    #5  0x0079eacd in __gethostbyname2_r (name=0x849e9bd "ilsole24ore.com", af=2, resbuf=0xb4d0d5fc, buffer=0xb4d0d300 "\177", buflen=512, result=0xb4d0d618,
        h_errnop=0xb4d0d614) at ../nss/getXXbyYY_r.c:253
    #6  0x00760010 in gaih_inet (name=<value optimized out>, service=<value optimized out>, req=0xb4d0f83c, pai=0xb4d0d764, naddrs=0xb4d0d754)
        at ../sysdeps/posix/getaddrinfo.c:531
    #7  0x00761a65 in *__GI_getaddrinfo (name=0x849e9bd "ilsole24ore.com", service=0x0, hints=0xb4d0f83c, pai=0xb4d0f860) at ../sysdeps/posix/getaddrinfo.c:2160
    #8  0x00917f9a in ?? () from /usr/lib/libkrb5support.so.0
    #9  0x003b2f45 in krb5_sname_to_principal () from /usr/lib/libkrb5.so.3
    #10 0x0028a278 in ?? () from /usr/lib/libgssapi_krb5.so.2
    #11 0x0027eff2 in ?? () from /usr/lib/libgssapi_krb5.so.2
    #12 0x0027fb00 in gss_init_sec_context () from /usr/lib/libgssapi_krb5.so.2
    #13 0x00d8770e in ?? () from /usr/lib/libcurl.so.4
    #14 0x00d62c27 in ?? () from /usr/lib/libcurl.so.4
    #15 0x00d7e25b in ?? () from /usr/lib/libcurl.so.4
    #16 0x00d7e597 in ?? () from /usr/lib/libcurl.so.4
    #17 0x00d7f133 in curl_easy_perform () from /usr/lib/libcurl.so.4

My function looks something like this

int do_http_check(taskinfo *info,standardResult *data)
{
    standardResultInit(data);

    char errorBuffer[CURL_ERROR_SIZE];

    CURL *curl;
    CURLcode result;

    curl = curl_easy_init();

    if(curl)
    {
        //required options first
        curl_easy_setopt(curl, CURLOPT_ERRORBUFFER, errorBuffer);
        curl_easy_setopt(curl, CURLOPT_URL, info->address.c_str());
        curl_easy_setopt(curl, CURLOPT_WRITEFUNCTION, writer);
        curl_easy_setopt(curl, CURLOPT_WRITEDATA, &data->body);
        curl_easy_setopt(curl, CURLOPT_HEADERFUNCTION, writer);
        curl_easy_setopt(curl, CURLOPT_WRITEHEADER, &data->head);
        curl_easy_setopt(curl, CURLOPT_DNS_USE_GLOBAL_CACHE,0);
        curl_easy_setopt(curl, CURLOPT_CONNECTTIMEOUT, 30 );
        curl_easy_setopt(curl, CURLOPT_NOSIGNAL,1);
        curl_easy_setopt(curl, CURLOPT_NOBODY,1);
        curl_easy_setopt(curl, CURLOPT_TIMEOUT ,240);

        //optional options
        if(info->options.follow)
        {
            curl_easy_setopt(curl, CURLOPT_FOLLOWLOCATION, 1);
            curl_easy_setopt(curl, CURLOPT_MAXREDIRS, info->options.redirects);
        }

        result = curl_easy_perform(curl);

        if (result == CURLE_OK)
        {
            data->success = true;
            curl_easy_getinfo(curl,CURLINFO_RESPONSE_CODE,&data->httpMsg);
            curl_easy_getinfo(curl,CURLINFO_REDIRECT_COUNT,&data->numRedirects);
            data->msg = "OK";
        }
        else
        {
            // ... handle error
        }
    }

    return 1;
}

Now, when I call the function without any threads, just from main, it never breaks, so I was thinking it is connected to threading, or maybe to how the result structure is being returned. But from what I saw in the trace, the fault is generated inside the curl_easy_perform() call, and that is confusing me.
So if someone has any idea where I should look next, it would be most helpful. Thanks.

Comments (2)

自在安然 2024-09-17 18:50:20

The libcurl documentation has a whole section dedicated to multi-threading:

The first basic rule is that you must never share a libcurl handle (be it easy or multi or whatever) between multiple threads. Only use one handle in one thread at a time.

libcurl is completely thread safe, except for two issues: signals and SSL/TLS handlers. Signals are used for timing out name resolves (during DNS lookup) - when built without c-ares support and not on Windows.

If you are accessing HTTPS or FTPS URLs in a multi-threaded manner, you are then of course using the underlying SSL library multi-threaded and those libs might have their own requirements on this issue. Basically, you need to provide one or two functions to allow it to function properly. For all details, see this:

OpenSSL

http://www.openssl.org/docs/crypto/threads.html#DESCRIPTION

GnuTLS

http://www.gnu.org/software/gnutls/manual/html_node/Multi_002dthreaded-applications.html

NSS

is claimed to be thread-safe already without anything required.

yassl

Required actions unknown.

When using multiple threads you should set the CURLOPT_NOSIGNAL option to 1 for all handles. Everything will or might work fine except that timeouts are not honored during the DNS lookup - which you can work around by building libcurl with c-ares support. c-ares is a library that provides asynchronous name resolves. On some platforms, libcurl simply will not function properly multi-threaded unless this option is set.

Also, note that CURLOPT_DNS_USE_GLOBAL_CACHE is not thread-safe.

二智少女 2024-09-17 18:50:20

As mentioned in "error: longjmp causes uninitialized stack frame", the latest libcurl versions (>= 7.32.0) in the Debian/Ubuntu repositories contain a new multithreaded resolver that solves these problems. c-ares support is not a good solution:

https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=570436#74

"The real problem is that c-ares is not yet a full replacement for gethostby* functions (e.g. it does not support multicast DNS) and enabling it in stock libcurl packages may not be a good move (note that these are words of the upstream author of both curl and c-ares, not mine)."
