Why does realloc consume so much memory?

Posted on 2024-10-03 08:22:36

This question is a bit long due to the source code, which I tried to simplify as much as possible. Please bear with me, and thanks for reading along.

I have an application with a loop that runs potentially millions of times. Instead of several thousands to millions of malloc/free calls within that loop, I would like to do one malloc up front and then several thousands to millions of realloc calls.

But when I use realloc, my application consumes several GB of memory and kills itself. If I use malloc, my memory usage is fine.

If I run on smaller test data sets with valgrind's memcheck, it reports no memory leaks with either malloc or realloc.

I have verified that I am matching every malloc-ed (and then realloc-ed) object with a corresponding free.

So, in theory, I am not leaking memory. It is just that using realloc seems to consume all of my available RAM, and I'd like to know why and what I can do to fix this.

What I have initially is something like this, which uses malloc and works properly:

Malloc code

void A () {
    do {
        B();
    } while (someConditionThatIsTrueForMillionInstances);
}

void B () {
    char *firstString = NULL;
    char *secondString = NULL;
    char *someOtherString;

    /* populate someOtherString with data from stream, for example */

    C((const char *)someOtherString, &firstString, &secondString);

    fprintf(stderr, "first: [%s] | second: [%s]\n", firstString, secondString);

    if (firstString)
        free(firstString);
    if (secondString)
        free(secondString);
}

void C (const char *someOtherString, char **firstString, char **secondString) {
    char firstBuffer[BUFLENGTH];
    char secondBuffer[BUFLENGTH];

    /* populate buffers with some data from tokenizing someOtherString in a special way */

    *firstString = malloc(strlen(firstBuffer)+1);
    strncpy(*firstString, firstBuffer, strlen(firstBuffer)+1);

    *secondString = malloc(strlen(secondBuffer)+1);
    strncpy(*secondString, secondBuffer, strlen(secondBuffer)+1);
}

This works fine. But I want something faster.

Now I test a realloc arrangement, which malloc-s only once:

Realloc code

void A () {
    char *firstString = NULL;
    char *secondString = NULL;

    do {
        B(&firstString, &secondString);
    } while (someConditionThatIsTrueForMillionInstances);

    if (firstString)
        free(firstString);
    if (secondString)
        free(secondString);
}

void B (char **firstString, char **secondString) {
    char *someOtherString;

    /* populate someOtherString with data from stream, for example */

    C((const char *)someOtherString, &(*firstString), &(*secondString));

    fprintf(stderr, "first: [%s] | second: [%s]\n", *firstString, *secondString);
}

void C (const char *someOtherString, char **firstString, char **secondString) {
    char firstBuffer[BUFLENGTH];
    char secondBuffer[BUFLENGTH];

    /* populate buffers with some data from tokenizing someOtherString in a special way */

    /* realloc should act as malloc on first pass through */

    *firstString = realloc(*firstString, strlen(firstBuffer)+1);
    strncpy(*firstString, firstBuffer, strlen(firstBuffer)+1);

    *secondString = realloc(*secondString, strlen(secondBuffer)+1);
    strncpy(*secondString, secondBuffer, strlen(secondBuffer)+1);
}

If I look at the output of free -m on the command-line while I run this realloc-based test with a large data set that causes the million-loop condition, my memory goes from 4 GB down to 0 and the app crashes.

What am I missing about using realloc that is causing this? Sorry if this is a dumb question, and thanks in advance for your advice.

财迷小姐 2024-10-10 08:22:36

realloc has to copy the contents from the old buffer to the new buffer if the resizing operation cannot be done in place. A malloc/free pair can be better than a realloc if you don't need to keep around the original memory.
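To make the copy step concrete, here is a minimal sketch of what the fallback path amounts to when a block cannot be resized in place. `copying_realloc` is a hypothetical helper written for illustration, not a libc function, and real allocators are more sophisticated; it takes the old size as a parameter because standard C offers no portable way to query it.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not part of libc): imitates the "cannot resize in
 * place" path of realloc. The caller supplies old_size because standard C
 * cannot ask an allocation how big it is. */
void *copying_realloc(void *old, size_t old_size, size_t new_size)
{
    assert(new_size > 0);   /* zero-size requests are left out of this sketch */

    void *fresh = malloc(new_size);
    if (fresh == NULL)
        return NULL;        /* like realloc, the old block stays valid on failure */

    if (old != NULL) {
        size_t keep = old_size < new_size ? old_size : new_size;
        memcpy(fresh, old, keep);   /* old and new blocks are both live here */
        free(old);
    }
    return fresh;
}
```

Between the `malloc` and the `free`, both blocks exist at once, which is the temporary extra memory the answer refers to.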

That's why realloc can temporarily require more memory than a malloc/free pair. You are also encouraging fragmentation by continuously interleaving reallocs. I.e., you are basically doing:

malloc(A);
malloc(B);

while (...)
{
    malloc(A_temp);
    free(A);
    A = A_temp;
    malloc(B_temp);
    free(B);
    B = B_temp;
}
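Whether a given realloc call actually moves the block depends on the allocator and on the sizes involved, so it may be worth measuring. The sketch below is a hypothetical experiment (`count_realloc_moves` is an illustrative name): it resizes one block through a list of sizes and counts how often the returned address changes. The result will differ between platforms and runs.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical experiment: resize a single block through the given sizes and
 * count how often realloc returns a different address.
 * Every size must be non-zero. Returns -1 if an allocation fails. */
long count_realloc_moves(const size_t *sizes, size_t n)
{
    char *block = NULL;
    long moves = 0;

    for (size_t i = 0; i < n; i++) {
        assert(sizes[i] > 0);

        /* Record the address as an integer before the call; the old pointer
         * value should not be examined once realloc may have freed it. */
        uintptr_t before = (uintptr_t)block;

        char *resized = realloc(block, sizes[i]);
        if (resized == NULL) {
            free(block);    /* the original block is still ours on failure */
            return -1;
        }
        /* The first call acts like malloc, so it is not counted as a move. */
        if (i > 0 && (uintptr_t)resized != before)
            moves++;
        block = resized;
    }
    free(block);
    return moves;
}
```

Feeding it the string lengths seen in a real run would show how often the pattern above is actually triggered on a particular system.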

Whereas the original code does:

while (...)
{
    malloc(A);
    malloc(B);
    free(A);
    free(B);
}

At the end of each iteration of the second loop, you have cleaned up all the memory you used. That is more likely to return the global memory heap to a clean state than interleaving memory allocations without ever freeing all of them.
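If the goal is simply to avoid an allocation on every iteration, one possible alternative is a buffer that is reused and only grows when a string does not fit. This is a sketch under assumptions, not something the answer above proposes: `struct strbuf`, `strbuf_set`, `strbuf_free`, and the starting capacity of 64 bytes are all hypothetical choices made for illustration.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical reusable string buffer; all names are illustrative. */
struct strbuf {
    char  *data;      /* NULL until first use */
    size_t capacity;  /* bytes currently allocated for data */
};

/* Copy src into buf, calling realloc only when the buffer is too small.
 * Returns 0 on success, -1 if realloc fails (buf is left unchanged). */
int strbuf_set(struct strbuf *buf, const char *src)
{
    assert(buf != NULL && src != NULL);

    size_t needed = strlen(src) + 1;

    if (needed > buf->capacity) {
        /* Grow geometrically so that repeated growth stays rare. */
        size_t grown = buf->capacity ? buf->capacity * 2 : 64;
        if (grown < needed)
            grown = needed;

        char *tmp = realloc(buf->data, grown);
        if (tmp == NULL)
            return -1;
        buf->data = tmp;
        buf->capacity = grown;
    }
    memcpy(buf->data, src, needed);  /* includes the terminating '\0' */
    return 0;
}

/* Release the buffer and reset it to the empty state. */
void strbuf_free(struct strbuf *buf)
{
    free(buf->data);
    buf->data = NULL;
    buf->capacity = 0;
}
```

In the question's layout, function A would own two such buffers initialised to `{NULL, 0}`, function C would call `strbuf_set` in place of each realloc/strncpy pair, and A would call `strbuf_free` once after the loop. Because the buffer never shrinks, the allocator is left alone once it is large enough for the longest string seen so far.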
