Did I find a libxml2 bug (memory leak in multithreaded parsing)?
I am currently working on data processing code that uses libxml2, and I am stuck on a memory leak that I cannot get rid of. Here is a minimal program that reproduces it:
#include <stdlib.h>
#include <stdio.h>
#include <libxml/parser.h>
#include <libxml/tree.h>
#include <omp.h>

int main(void)
{
    xmlDoc *doc;
    int tn;
    char fname[32];

    omp_set_num_threads(2);
    xmlInitParser();  /* initialize libxml2 once, in the main thread */

    #pragma omp parallel private(doc,tn,fname)
    {
        /* each thread parses its own file: testdoc0.xml, testdoc1.xml */
        tn = omp_get_thread_num();
        snprintf(fname, sizeof fname, "testdoc%d.xml", tn);
        doc = xmlReadFile(fname, NULL, 0);
        printf("document %s parsed on thread %d (%p)\n", fname, tn, (void *)doc);
        xmlFreeDoc(doc);
    }

    xmlCleanupParser();
    return EXIT_SUCCESS;
}
At runtime, the output is:
document testdoc0.xml parsed on thread 0 (0x1005413a0)
document testdoc1.xml parsed on thread 1 (0x1005543c0)
confirming that the program really runs on multiple threads and that doc is really private in the parallel region. Note that I followed the thread-safety instructions for libxml2 (http://xmlsoft.org/threads.html). Valgrind reports:
HEAP SUMMARY:
in use at exit: 9,000 bytes in 8 blocks
total heap usage: 956 allocs, 948 frees, 184,464 bytes allocated
968 bytes in 1 blocks are definitely lost in loss record 6 of 8
at 0x1000107AF: malloc (vg_replace_malloc.c:236)
by 0x1000B2590: xmlGetGlobalState (in /opt/local/lib/libxml2.2.dylib)
by 0x1000B1A18: __xmlDefaultSAXHandler (in /opt/local/lib/libxml2.2.dylib)
by 0x100106D18: xmlDefaultSAXHandlerInit (in /opt/local/lib/libxml2.2.dylib)
by 0x100041BE7: xmlInitParserCtxt (in /opt/local/lib/libxml2.2.dylib)
by 0x100042145: xmlNewParserCtxt (in /opt/local/lib/libxml2.2.dylib)
by 0x10004615E: xmlCreateURLParserCtxt (in /opt/local/lib/libxml2.2.dylib)
by 0x10005B56B: xmlReadFile (in /opt/local/lib/libxml2.2.dylib)
by 0x100000E03: main.omp_fn.0 (in ./xtest)
by 0x100028FA3: gomp_thread_start (in /opt/local/lib/gcc44/libgomp.1.dylib)
by 0x1001E8535: _pthread_start (in /usr/lib/libSystem.B.dylib)
by 0x1001E83E8: thread_start (in /usr/lib/libSystem.B.dylib)
LEAK SUMMARY:
definitely lost: 968 bytes in 1 blocks
indirectly lost: 0 bytes in 0 blocks
possibly lost: 0 bytes in 0 blocks
still reachable: 8,032 bytes in 7 blocks
suppressed: 0 bytes in 0 blocks
Reachable blocks (those to which a pointer was found) are not shown.
To see them, rerun with: --leak-check=full --show-reachable=yes
This happens for me regardless of which XML document is used. I am using libxml2 2.7.8 on Mac OS X 10.6.5 with gcc 4.4.5.
Is anyone able to reproduce this bug?
Thanks,
Antonin
Comments (2)
From the web site you listed above (http://xmlsoft.org/threads.html):
Your example seems to call xmlReadFile on the same document (testdoc.xml) in each thread. It further states:
You should probably bring this up on the libxml2 mailing list.
http://mail.gnome.org/mailman/listinfo/xml
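One possible reading of the Valgrind trace in the question, offered as an assumption rather than a confirmed diagnosis: the lost block is allocated by xmlGetGlobalState on a worker thread started by gomp_thread_start, which looks like per-thread state created on first use. If that state is registered as POSIX thread-specific data with a destructor, it is only released when the owning thread exits, and pooled OpenMP workers may still be alive when the process ends. The sketch below illustrates that mechanism with plain pthreads only. It does not use libxml2, and every name in it (thread_state, get_state, free_state, run_workers_and_join, touch_state_on_caller) is hypothetical.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

#define MAX_WORKERS 16

/* Hypothetical stand-in for a library's per-thread state (not libxml2 code). */
typedef struct { int initialized; } thread_state;

static pthread_key_t state_key;
static pthread_once_t state_once = PTHREAD_ONCE_INIT;
static pthread_mutex_t count_lock = PTHREAD_MUTEX_INITIALIZER;
static int live_states = 0;  /* states allocated and not yet freed */

static void adjust_count(int delta)
{
    pthread_mutex_lock(&count_lock);
    live_states += delta;
    pthread_mutex_unlock(&count_lock);
}

static int read_count(void)
{
    pthread_mutex_lock(&count_lock);
    int n = live_states;
    pthread_mutex_unlock(&count_lock);
    return n;
}

/* Destructor: run by the pthread runtime when a thread holding a value exits. */
static void free_state(void *p)
{
    free(p);
    adjust_count(-1);
}

static void make_key(void)
{
    pthread_key_create(&state_key, free_state);
}

/* Lazily allocate the calling thread's state on first use. */
static thread_state *get_state(void)
{
    pthread_once(&state_once, make_key);
    thread_state *st = pthread_getspecific(state_key);
    if (st == NULL) {
        st = calloc(1, sizeof *st);
        if (st == NULL)
            return NULL;
        st->initialized = 1;
        pthread_setspecific(state_key, st);
        adjust_count(+1);
    }
    return st;
}

static void *worker(void *arg)
{
    (void)arg;
    get_state();
    return NULL;  /* the thread ends here, so free_state runs for it */
}

/* Start n workers, join them all, and return how many states remain allocated. */
int run_workers_and_join(int n)
{
    pthread_t tid[MAX_WORKERS];
    int created = 0;

    assert(n >= 0 && n <= MAX_WORKERS);
    for (int i = 0; i < n; i++) {
        if (pthread_create(&tid[created], NULL, worker, NULL) == 0)
            created++;
    }
    for (int i = 0; i < created; i++)
        pthread_join(tid[i], NULL);
    return read_count();
}

/* Touch the state from the calling thread, which has not exited yet. */
int touch_state_on_caller(void)
{
    get_state();
    return read_count();
}
```

In this sketch, workers that are joined leave nothing behind, while the state of a thread that has not exited stays allocated. If the same applies here, running the two xmlReadFile calls on threads that are created and joined explicitly, instead of in an OpenMP parallel region, would be one way to check whether the 968 bytes still show up as lost.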