Valgrind/massif does not detect memory deallocation in pthread TLS destructors

Posted on 2025-02-11 04:37:28

Symptoms:
I allocate a TLS key with a destructor, create a bunch of threads, and pass the TLS key to each thread. Each thread allocates memory and stores the pointer in TLS; the TLS destructor deallocates the memory. I wait for the threads to finish before the app exits.
The app is run under valgrind/massif, which reports this memory as not deallocated.

int main(int argc, char **argv)
{
  pthread_key_t* key = new pthread_key_t();
  pthread_key_create(key, my_destructor);

  pthread_t threads[32000];

  for(int i=0; i<32000; ++i)
    pthread_create(&threads[i], NULL, my_thread, key);

  for(int i=0; i<32000; ++i)
    pthread_join(threads[i], NULL);

  return 0;
}

In the thread runner I allocate the memory and set it up in the TLS:

extern "C" void* my_thread(void* p)
{
  pthread_setspecific(*(pthread_key_t*)p, malloc(100));

  return NULL;
}

In the TLS destructor, I release the memory:

extern "C" void my_destructor(void *p)
{
  free(p);
}

I run this under valgrind/massif 3.19 with the following options:

  --tool=massif
  --heap=yes
  --pages-as-heap=yes
  --log-file=/tmp/my.log
  --massif-out-file=/tmp/my.massif.log

Then I run ms_print /tmp/my.massif.log. I get leaks reported like the following:

| ->01.75% (67,108,864B) 0x76F92D0: new_heap (in /usr/lib64/libc-2.17.so)
| | ->01.75% (67,108,864B) 0x76F98D3: arena_get2.isra.3 (in /usr/lib64/libc-2.17.so)
| |   ->01.75% (67,108,864B) 0x76FF77D: malloc (in /usr/lib64/libc-2.17.so)
| |     ->01.75% (67,108,864B) 0x410300: my_thread (threadsT.cpp:136)
| |       ...
| |       <skipped by author>
| |       ...
| |             
| ->00.00% (73,728B) in 1+ places, all below ms_print's threshold (01.00%)

...while I would not expect anything to be reported as leaked at all.

I added instrumentation to my_destructor and manually verified that:

  • it is indeed invoked
  • it deallocates the memory, as it is supposed to

Is there something apparent I am doing wrong here that makes valgrind/massif report these?
Is it a valgrind/massif limitation that it cannot detect the memory deallocation when invoked from TLS destructors?

This is built and run with gcc 4.9.4 on Red Hat Enterprise Linux Server release 7.9 (Maipo).


Comments (2)

慕烟庭风 2025-02-18 04:37:28

A second answer, this time concentrating on the 'leak' aspect.

Massif isn't really a leak detector. It's for profiling heap use.

If I compile the example (with 320 threads) then at the end I get about 89 million bytes still allocated. That is made up of:

  • 75%: the arena used by malloc called from start_thread
  • 9%: pthread_create
  • 15%: loading shared libraries

None of that looks like much of a concern to me. I assume that the start_thread memory is the pthread stack cache.
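
For context on the arena figure: assuming the 67,108,864-byte entries in the question's trace are glibc malloc arenas (the new_heap and arena_get2 frames suggest this, and the size is exactly 64 MiB), they would be page-level reservations that free() hands back to the arena rather than unmapping, so --pages-as-heap=yes would keep counting them. Here is a minimal, glibc-specific sketch of capping the number of arenas to test that assumption. cap_malloc_arenas is a hypothetical helper name; mallopt and M_ARENA_MAX come from glibc's <malloc.h>.

```cpp
#include <malloc.h>   // glibc-specific: mallopt, M_ARENA_MAX

// Asks glibc malloc to use at most `max_arenas` arenas.
// Meant to be called early in main, before any thread starts allocating.
// Returns true if glibc accepted the setting (mallopt returns 1 on success).
bool cap_malloc_arenas(int max_arenas)
{
  return mallopt(M_ARENA_MAX, max_arenas) == 1;
}
```

With cap_malloc_arenas(1) at the top of main, all threads should share the main arena. If the large new_heap entries then drop out of the ms_print output, that would point to arena reservations rather than leaked blocks. I would treat this as a diagnostic aid only, since a single arena increases lock contention between threads.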

If I use massif to profile only malloc/new (its default mode, without --pages-as-heap=yes), then the last sample is:

  n        time(i)         total(B)   useful-heap(B) extra-heap(B)    stacks(B)
73      2,929,610            2,360            2,308            52            0

撩人痒 2025-02-18 04:37:28

You should check the return status for your thread creation. It's unlikely that you are succeeding in creating 32000 threads.
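
As a rough sketch of what checking the return status could look like: pthread_create returns 0 on success and an error number (not errno) on failure. The helper names create_checked, join_all and noop_thread are made up for this illustration, and the thread body is a placeholder for the real my_thread.

```cpp
#include <pthread.h>
#include <cstddef>
#include <cstdio>
#include <cstring>
#include <vector>

// Hypothetical no-op thread body, used only for this illustration.
extern "C" void* noop_thread(void*)
{
  return NULL;
}

// Tries to start `wanted` threads and returns how many were actually created.
// Stops at the first failure and reports the error number returned by pthread_create.
int create_checked(int wanted, std::vector<pthread_t>& threads)
{
  threads.clear();
  for (int i = 0; i < wanted; ++i) {
    pthread_t t;
    int rc = pthread_create(&t, NULL, noop_thread, NULL);
    if (rc != 0) {
      std::fprintf(stderr, "pthread_create #%d failed: %s\n", i, std::strerror(rc));
      break;
    }
    threads.push_back(t);
  }
  return static_cast<int>(threads.size());
}

// Joins only the threads that were successfully created; returns the number joined.
int join_all(const std::vector<pthread_t>& threads)
{
  int joined = 0;
  for (std::size_t i = 0; i < threads.size(); ++i)
    if (pthread_join(threads[i], NULL) == 0)
      ++joined;
  return joined;
}
```

Calling create_checked(32000, threads) and comparing the result with 32000 would show directly whether all the threads were started. Joining only the handles that were filled in also avoids calling pthread_join on uninitialised pthread_t values.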

A bit of Valgrind source:

coregrind/pub_core_options.h:#define MAX_THREADS_DEFAULT 500
coregrind/m_scheduler/scheduler.c:   VG_(printf)("Use --max-threads=INT to specify a larger number of threads\n"

Assuming that this is amd64 Linux, I believe that the default pthread stack size is 8 Mbytes. That means you need about 256 Gbytes (32000 × 8 Mbytes) for stack memory. Does your machine have that much?
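
To put a number on that estimate for a particular machine, something like the following could be used. It is only a sketch: default_stack_size and total_stack_bytes are hypothetical helper names, and it assumes the C library reports its default stack size through a freshly initialised attribute object, which I believe glibc does.

```cpp
#include <pthread.h>
#include <cstddef>

// Returns the stack size reported by a default-initialised attribute object,
// or 0 if any of the pthread_attr_* calls fails.
std::size_t default_stack_size()
{
  pthread_attr_t attr;
  if (pthread_attr_init(&attr) != 0)
    return 0;
  std::size_t size = 0;
  if (pthread_attr_getstacksize(&attr, &size) != 0)
    size = 0;
  pthread_attr_destroy(&attr);
  return size;
}

// Total stack address space needed if `threads` threads each reserve `per_thread` bytes.
unsigned long long total_stack_bytes(unsigned long long threads,
                                     unsigned long long per_thread)
{
  return threads * per_thread;
}
```

With 32000 threads and 8 MiB per stack, total_stack_bytes gives 268,435,456,000 bytes, which is 250 GiB. Passing default_stack_size() instead of the 8 MiB constant gives the figure for the actual default on the machine in question.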

Please try the following:

  1. Use pthread_attr_setstacksize to set the stack sizes to PTHREAD_STACK_MIN (16k).
  2. Run Valgrind with --max-threads=32001
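
For step 1, a minimal sketch of creating a thread with the smallest permitted stack might look like this. run_with_min_stack and small_stack_thread are hypothetical names, and the thread body is a placeholder, so whether the real thread function fits in such a small stack is an assumption that needs checking.

```cpp
#include <pthread.h>
#include <limits.h>   // PTHREAD_STACK_MIN

// Hypothetical placeholder thread body; a real one must fit in the small stack.
extern "C" void* small_stack_thread(void*)
{
  return NULL;
}

// Creates one thread with a PTHREAD_STACK_MIN-sized stack and joins it.
// Returns 0 on success, otherwise the first error number encountered.
int run_with_min_stack()
{
  pthread_attr_t attr;
  int rc = pthread_attr_init(&attr);
  if (rc != 0)
    return rc;

  rc = pthread_attr_setstacksize(&attr, PTHREAD_STACK_MIN);
  if (rc == 0) {
    pthread_t t;
    rc = pthread_create(&t, &attr, small_stack_thread, NULL);
    if (rc == 0)
      rc = pthread_join(t, NULL);
  }

  pthread_attr_destroy(&attr);
  return rc;
}
```

In the question's program the same attr object would be initialised once before the loop, passed as the second argument of every pthread_create call instead of NULL, and destroyed after the loop.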

Even with the above you may still hit some Valgrind limits such as VG_N_SEGMENTS.

If you see a message like

"Valgrind: FATAL: VG_N_SEGMENTS is too low.
Increase it and rebuild.
Exiting now."

Then you will need to rebuild Valgrind with an increased limit.
