Did I find a libxml2 bug (memory leak in multi-threaded parsing)?

Posted on 2024-10-10 16:21:23

I am currently working on data-processing code that uses libxml2, and I am stuck on a memory leak that I cannot get rid of. Here is a minimal program that reproduces it:

#include <stdlib.h>
#include <stdio.h>
#include <libxml/parser.h>
#include <libxml/tree.h>
#include <omp.h>

int main(void)
{
    xmlDoc *doc;
    int tn;
    char fname[32];

    omp_set_num_threads(2);
    /* Initialize libxml2 once, before any thread is started. */
    xmlInitParser();
    #pragma omp parallel private(doc,tn,fname)
    {
        /* Each thread parses its own file: testdoc0.xml, testdoc1.xml. */
        tn  = omp_get_thread_num();
        snprintf(fname,sizeof fname,"testdoc%d.xml",tn);
        doc = xmlReadFile(fname,NULL,0);
        printf("document %s parsed on thread %d (%p)\n",fname,tn,(void *)doc);
        xmlFreeDoc(doc);
    }
    /* Release libxml2 global data once the parallel region has ended. */
    xmlCleanupParser();

    return EXIT_SUCCESS;
}

At runtime, the output is:

document testdoc0.xml parsed on thread 0 (0x1005413a0)
document testdoc1.xml parsed on thread 1 (0x1005543c0)

This confirms that the program really is multi-threaded and that doc really is private within the parallel region. Note that I followed the thread-safety instructions for libxml2 (http://xmlsoft.org/threads.html). Valgrind reports:

HEAP SUMMARY:
    in use at exit: 9,000 bytes in 8 blocks
  total heap usage: 956 allocs, 948 frees, 184,464 bytes allocated

968 bytes in 1 blocks are definitely lost in loss record 6 of 8
   at 0x1000107AF: malloc (vg_replace_malloc.c:236)
   by 0x1000B2590: xmlGetGlobalState (in /opt/local/lib/libxml2.2.dylib)
   by 0x1000B1A18: __xmlDefaultSAXHandler (in /opt/local/lib/libxml2.2.dylib)
   by 0x100106D18: xmlDefaultSAXHandlerInit (in /opt/local/lib/libxml2.2.dylib)
   by 0x100041BE7: xmlInitParserCtxt (in /opt/local/lib/libxml2.2.dylib)
   by 0x100042145: xmlNewParserCtxt (in /opt/local/lib/libxml2.2.dylib)
   by 0x10004615E: xmlCreateURLParserCtxt (in /opt/local/lib/libxml2.2.dylib)
   by 0x10005B56B: xmlReadFile (in /opt/local/lib/libxml2.2.dylib)
   by 0x100000E03: main.omp_fn.0 (in ./xtest)
   by 0x100028FA3: gomp_thread_start (in /opt/local/lib/gcc44/libgomp.1.dylib)
   by 0x1001E8535: _pthread_start (in /usr/lib/libSystem.B.dylib)
   by 0x1001E83E8: thread_start (in /usr/lib/libSystem.B.dylib)

LEAK SUMMARY:
   definitely lost: 968 bytes in 1 blocks
   indirectly lost: 0 bytes in 0 blocks
     possibly lost: 0 bytes in 0 blocks
   still reachable: 8,032 bytes in 7 blocks
        suppressed: 0 bytes in 0 blocks
Reachable blocks (those to which a pointer was found) are not shown.
To see them, rerun with: --leak-check=full --show-reachable=yes

I get this result whatever XML documents I use. I am using libxml2 2.7.8 on Mac OS X 10.6.5 with gcc 4.4.5.

Is anyone able to reproduce this bug?

Thanks,

Antonin
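
A note on reading the trace above: the lost block is allocated in xmlGetGlobalState, reached from a thread started by gomp_thread_start, so it belongs to an OpenMP worker and not to the main thread. One possible explanation, stated here as an assumption and not as a confirmed diagnosis, is that this is per-thread state that is only released when its thread exits, and that the worker is still alive when the process ends. The sketch below shows that general pattern with plain POSIX threads. It does not call libxml2, and thread_state, get_thread_state, unfreed_states_after_join and total_states_allocated are hypothetical names made up for the illustration.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

#define MAX_THREADS 8

/* Hypothetical stand-in for a library's per-thread state block. */
typedef struct { char scratch[64]; } thread_state;

static pthread_key_t state_key;
static pthread_once_t key_once = PTHREAD_ONCE_INIT;
static pthread_mutex_t count_lock = PTHREAD_MUTEX_INITIALIZER;
static int states_allocated = 0;
static int states_freed = 0;

/* Key destructor: runs when a thread holding a non-NULL value exits. */
static void free_state(void *p)
{
    free(p);
    pthread_mutex_lock(&count_lock);
    states_freed++;
    pthread_mutex_unlock(&count_lock);
}

static void make_key(void)
{
    pthread_key_create(&state_key, free_state);
}

/* Return the calling thread's state, allocating it on first use. */
thread_state *get_thread_state(void)
{
    thread_state *st;

    pthread_once(&key_once, make_key);
    st = pthread_getspecific(state_key);
    if (st == NULL) {
        st = calloc(1, sizeof *st);
        if (st == NULL || pthread_setspecific(state_key, st) != 0) {
            free(st);
            return NULL;
        }
        pthread_mutex_lock(&count_lock);
        states_allocated++;
        pthread_mutex_unlock(&count_lock);
    }
    return st;
}

static void *worker(void *arg)
{
    (void)arg;
    get_thread_state();  /* first call allocates this thread's block */
    get_thread_state();  /* later calls reuse it */
    return NULL;
}

/* Run n_threads workers to completion; return blocks still unfreed, or -1. */
int unfreed_states_after_join(int n_threads)
{
    pthread_t tids[MAX_THREADS];
    int started = 0, unfreed;

    assert(n_threads > 0 && n_threads <= MAX_THREADS);
    while (started < n_threads &&
           pthread_create(&tids[started], NULL, worker, NULL) == 0)
        started++;
    for (int i = 0; i < started; i++)
        pthread_join(tids[i], NULL);  /* each destructor has run by now */

    pthread_mutex_lock(&count_lock);
    unfreed = states_allocated - states_freed;
    pthread_mutex_unlock(&count_lock);
    return started == n_threads ? unfreed : -1;
}

/* Total number of per-thread blocks allocated so far. */
int total_states_allocated(void)
{
    int n;

    pthread_mutex_lock(&count_lock);
    n = states_allocated;
    pthread_mutex_unlock(&count_lock);
    return n;
}
```

Calling unfreed_states_after_join(2) from a main function leaves nothing unfreed, because both threads are joined and their destructors have run. A thread that is still parked in a pool when the process exits never reaches its destructor. If that is what happens in the report above, the 968 bytes would be a thread-lifetime effect and not a leak that grows with each parse, but only the libxml2 sources or its maintainers can confirm this.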

Comments (2)

辞别 2024-10-17 16:21:23

From the web site you listed above (http://xmlsoft.org/threads.html):

Starting with 2.4.7, libxml2 makes provisions to ensure that concurrent threads can safely work in parallel parsing different documents.

Your example seems to call xmlReadFile on the same document (testdoc.xml) from every thread. The page further states:

Note that the thread safety cannot be ensured for multiple threads sharing the same document, the locking must be done at the application level ...
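
To illustrate what "locking at the application level" can look like, here is a minimal sketch with POSIX threads. It is an illustration under stated assumptions and not libxml2 code: shared_doc is a hypothetical stand-in for a document that several threads touch, and add_nodes and count_after_parallel_updates are made-up names. With a real xmlDoc, the same mutex would have to be held around every access to the shared tree.

```c
#include <assert.h>
#include <pthread.h>

#define MAX_THREADS 8
#define UPDATES_PER_THREAD 1000

/* Hypothetical stand-in for one document shared by several threads. */
typedef struct {
    pthread_mutex_t lock;  /* application-level lock guarding node_count */
    int node_count;
} shared_doc;

/* Each worker modifies the shared document only while holding its lock. */
static void *add_nodes(void *arg)
{
    shared_doc *doc = arg;

    for (int i = 0; i < UPDATES_PER_THREAD; i++) {
        pthread_mutex_lock(&doc->lock);
        doc->node_count++;
        pthread_mutex_unlock(&doc->lock);
    }
    return NULL;
}

/* Run n_threads workers on one shared_doc; return the final count, or -1. */
int count_after_parallel_updates(int n_threads)
{
    shared_doc doc;
    pthread_t tids[MAX_THREADS];
    int started = 0, result;

    assert(n_threads > 0 && n_threads <= MAX_THREADS);
    doc.node_count = 0;
    if (pthread_mutex_init(&doc.lock, NULL) != 0)
        return -1;

    while (started < n_threads &&
           pthread_create(&tids[started], NULL, add_nodes, &doc) == 0)
        started++;
    for (int i = 0; i < started; i++)
        pthread_join(tids[i], NULL);

    result = (started == n_threads) ? doc.node_count : -1;
    pthread_mutex_destroy(&doc.lock);
    return result;
}
```

Because every increment happens under the lock, no update is lost: four threads doing 1000 updates each leave node_count at 4000. Without the lock, the final count could come out lower.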

若水般的淡然安静女子 2024-10-17 16:21:23

You should probably bring this up on the libxml2 mailing list.

http://mail.gnome.org/mailman/listinfo/xml
