How much overhead do realloc calls introduce?

Posted on 2024-10-27 07:21:34

I call realloc on every iteration of a for loop that runs more than 10,000 times.

Is this good practice? Will calling realloc that many times cause an error?

弃爱 2024-11-03 07:21:34

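It won't fail unless you run out of memory (which would happen with any other allocator too), but your code will usually run faster if you manage to estimate the required storage up front.

It is often better to run one extra pass over the loop just to determine the storage requirements.

I wouldn't say realloc is a no-go, but it isn't good practice either.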

物价感观 2024-11-03 07:21:34

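I stumbled on this question recently, and although it is quite old, I feel the information here is not entirely accurate.

Regarding an extra loop to predetermine how many bytes of memory are needed:

Using an extra loop is not always, or even usually, better. What does predetermining the required memory involve? It may incur additional I/O that is expensive and unnecessary.

Regarding the use of realloc in general:

The alloc family of functions (malloc, calloc, realloc, and free) is very efficient. The underlying allocator obtains a large chunk from the OS and then hands out pieces of it to the user on request. Consecutive calls to realloc will almost certainly just add extra space at the current memory location, provided nothing else has been allocated right after the block.

You don't want to maintain a heap pool yourself when the system already does it for you more efficiently and more correctly.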

九歌凝 2024-11-03 07:21:34

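If you do this, you risk memory fragmentation. That degrades performance and, on 32-bit systems, can lead to out-of-memory failures because no large contiguous block of address space is available.

I'm guessing you are increasing the length of the array by 1 each time. If so, you are better off tracking both a capacity and a length, and increasing the capacity only when you need a length that exceeds the current capacity. When you do increase the capacity, increase it by more than just 1.

Of course, the standard containers do this sort of thing for you, so if you can use them, do so.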
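
The capacity/length bookkeeping described in this answer could look roughly like the sketch below. The names int_vec, int_vec_push and int_vec_free are made up for illustration, and the doubling factor and the initial capacity of 16 are assumptions rather than anything the thread prescribes.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical growable int array: `len` elements in use, `cap` allocated. */
typedef struct {
    int   *data;
    size_t len;
    size_t cap;
} int_vec;

/* Append one value. Returns 0 on success, -1 if memory cannot be obtained.
 * On failure the vector is left unchanged and still valid. */
static int int_vec_push(int_vec *v, int value)
{
    if (v->len == v->cap) {
        /* Grow geometrically (x2 here, an assumed factor) so that realloc
         * is called O(log n) times instead of once per element. */
        size_t new_cap = v->cap ? v->cap * 2 : 16;
        if (new_cap < v->cap || new_cap > SIZE_MAX / sizeof *v->data)
            return -1;                  /* size computation would overflow */

        int *tmp = realloc(v->data, new_cap * sizeof *v->data);
        if (tmp == NULL)
            return -1;                  /* old block is still owned by v */

        v->data = tmp;
        v->cap  = new_cap;
    }
    v->data[v->len++] = value;
    return 0;
}

/* Release the storage and reset the vector to its empty state. */
static void int_vec_free(int_vec *v)
{
    free(v->data);
    v->data = NULL;
    v->len = v->cap = 0;
}
```

Start from a zero-initialized int_vec, call int_vec_push once per loop iteration, and call int_vec_free when done. With these assumed parameters, 10,000 appends reach realloc only 11 times.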

呆° 2024-11-03 07:21:34

In addition to what has been said already, there are a few more things to consider:

Performance of realloc(<N-sized-buf>, N + inc) depends on two things:

the speed of malloc(N + inc), which usually degrades towards O(N) with the size of the allocated block
the speed of memcpy(newbuf, oldbuf, N), which is also O(N) with the size of the block

That means that for small increments on a large existing block, each realloc() can cost O(N), so growing a buffer to N bytes in fixed steps is O(N^2) overall. Think bubblesort vs. quicksort ...

It's comparatively cheap if you start with a small block but will significantly punish you if the to-be-reallocated block is large. To mitigate, you should make sure that inc is not small relative to the existing size; realloc'ing by a constant amount is a recipe for performance problems.
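
To put rough numbers on this, the sketch below totals the bytes copied under a deliberately pessimistic model in which every realloc moves the block. That model is an assumption for illustration only; real allocators often extend in place. The function names are hypothetical.

```c
#include <stddef.h>

/* Worst-case model (an assumption): every resize moves the block, so
 * growing away from `size` bytes copies `size` bytes.
 * `inc` and `start` must be nonzero or the loops would never finish. */

/* Total bytes copied while growing from `start` to `target` in fixed steps. */
static unsigned long long copied_constant(size_t start, size_t target, size_t inc)
{
    unsigned long long total = 0;
    for (size_t size = start; size < target; size += inc)
        total += size;
    return total;
}

/* Total bytes copied while growing from `start` to `target` by doubling. */
static unsigned long long copied_geometric(size_t start, size_t target)
{
    unsigned long long total = 0;
    for (size_t size = start; size < target; size *= 2)
        total += size;
    return total;
}
```

Growing from 64 bytes to 1 MiB in 64-byte steps gives 8,589,410,304 bytes copied under this model, versus 1,048,512 bytes when doubling: about 8 GiB against about 1 MiB for the same final buffer.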

Additionally, even if you grow in large increments (say, scale the new size to be 150% of the old), there's a memory usage spike from realloc'ing a large buffer: while the existing contents are being copied, the old and the new block both exist, so you need more than twice the old size. A sequence of:

addr = malloc(N);
addr = realloc(addr, N + inc);

therefore fails (much) sooner than:

addr[0] = malloc(N);
addr[1] = malloc(inc);

There are data structures out there which do not require realloc() to grow; linked lists, skip lists, interval trees all can append data without having to copy existing data. C++ std::deque grows in this fashion: it allocates fixed-size chunks and adds new chunks as it grows, so appending does not copy the existing elements. (std::vector, by contrast, does allocate a larger buffer and move its elements once it exceeds its capacity.) Consider implementing (or using a preexisting implementation of) something like that.
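
One way to grow without ever copying existing data is a list of fixed-size chunks. This is a minimal sketch, not a drop-in container: the names chunk, chunk_list and CHUNK_INTS and the chunk size of 1024 are all assumptions made for illustration.

```c
#include <stdlib.h>

#define CHUNK_INTS 1024          /* hypothetical number of ints per chunk */

typedef struct chunk {
    struct chunk *next;
    size_t        used;          /* slots filled in this chunk */
    int           items[CHUNK_INTS];
} chunk;

typedef struct {
    chunk *head, *tail;
    size_t count;                /* total elements across all chunks */
} chunk_list;

/* Append one value. Existing elements never move, so pointers into
 * earlier chunks stay valid. Returns 0 on success, -1 on allocation failure. */
static int chunk_list_push(chunk_list *l, int value)
{
    if (l->tail == NULL || l->tail->used == CHUNK_INTS) {
        chunk *c = malloc(sizeof *c);
        if (c == NULL)
            return -1;
        c->next = NULL;
        c->used = 0;
        if (l->tail != NULL)
            l->tail->next = c;   /* link after the current last chunk */
        else
            l->head = c;         /* first chunk in the list */
        l->tail = c;
    }
    l->tail->items[l->tail->used++] = value;
    l->count++;
    return 0;
}

/* Free every chunk and reset the list to empty. */
static void chunk_list_free(chunk_list *l)
{
    chunk *c = l->head;
    while (c != NULL) {
        chunk *next = c->next;
        free(c);
        c = next;
    }
    l->head = l->tail = NULL;
    l->count = 0;
}
```

The trade-off is that the elements are no longer contiguous, so indexing by position means walking the chunks (or keeping a separate table of chunk pointers).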

煮酒 2024-11-03 07:21:34

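You should grow the allocation geometrically, for example by doubling it so the sizes are powers of 2. Typical STL implementations use this kind of strategy (growth factors of 1.5 or 2 are common), and it works well with the way memory is managed.
realloc does not fail unless you are out of memory (in which case it returns NULL), but it may copy the existing (old) data to a new location, which can be a performance issue.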

太傻旳人生 2024-11-03 07:21:34

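In C:

Used properly, there's nothing wrong with realloc. That said, it's easy to use it incorrectly. See Writing Solid Code for an in-depth discussion of all the ways to mess up calling realloc and of the additional complications it can cause while debugging.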
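
One of the most common ways to misuse realloc is to assign its result straight back to the only pointer to the block. A hedged sketch of a safer pattern follows; resize_buffer is a hypothetical helper written for this example, not a standard function.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: resize *buf to hold `count` elements of `elem_size`
 * bytes each. Returns 0 on success, -1 on failure. On failure *buf is
 * untouched and still points to the old, valid block. */
static int resize_buffer(void **buf, size_t count, size_t elem_size)
{
    if (elem_size != 0 && count > SIZE_MAX / elem_size)
        return -1;               /* count * elem_size would overflow */

    size_t bytes = count * elem_size;
    if (bytes == 0)
        bytes = 1;               /* avoid realloc(ptr, 0), whose behavior varies */

    /* Never write `*buf = realloc(*buf, bytes)`: if realloc fails it
     * returns NULL but leaves the old block allocated, so overwriting the
     * only pointer to it would leak the block and lose the data. */
    void *tmp = realloc(*buf, bytes);
    if (tmp == NULL)
        return -1;

    *buf = tmp;
    return 0;
}
```

The caller keeps a void pointer, passes its address, and checks the return value before touching the buffer. Any other pointers into the old block must be treated as invalid after a successful call, because the block may have moved.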

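If you find yourself reallocating the same buffer again and again with only a small incremental size bump, be aware that it's usually much more efficient to allocate more space than you need and then keep track of the space actually used. If you exceed the allocated space, allocate a new buffer at a larger size, copy the contents, and free the old buffer.

In C++:

You should probably avoid realloc (as well as malloc and free). Whenever possible, use a container class from the standard library (e.g., std::vector). They are well tested and well optimized, and they relieve you of a lot of the housekeeping details of managing memory correctly (such as dealing with exceptions).

C++ doesn't have the concept of reallocating an existing buffer. Instead, a new buffer is allocated at the new size, the contents are copied, and the old buffer is deleted. This is what realloc does when it cannot satisfy the new size at the existing location, which makes C++'s approach look less efficient. But it's rare that realloc can actually take advantage of an in-place reallocation. The standard C++ containers are quite smart about allocating in a way that minimizes fragmentation and about amortizing the cost across many updates, so it's generally not worth the effort to pursue realloc if your goal is to improve performance.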

蓝海似她心 2024-11-03 07:21:34

I thought I would add some empirical data to this discussion.

A simple test program:

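#include <stdio.h>
#include <stdlib.h>

/* Grow one buffer in 64-byte steps up to 1 MiB and count how often
 * realloc hands back a different address. */
int main(void)
{
    void *buf = NULL, *new;
    size_t len;
    int n = 0, cpy = 0;   /* n: iterations, cpy: address changes */

    for (len = 64; len < 0x100000; len += 64, n++) {
        new = realloc(buf, len);
        if (!new) {
            fprintf(stderr, "out of memory\n");
            return 1;
        }

        /* A changed address means the block was moved. The first call
         * (buf == NULL) is counted too, although nothing is copied then. */
        if (new != buf) {
            cpy++;
            printf("new buffer at %#zx\n", len);
        }

        buf = new;
    }

    free(buf);
    printf("%d memcpys in %d iterations\n", cpy, n);
    return 0;
}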

GLIBC on x86_64 yields this output:

new buffer at 0x40
new buffer at 0x80
new buffer at 0x20940
new buffer at 0x21000
new buffer at 0x22000
new buffer at 0x23000
new buffer at 0x24000
new buffer at 0x25000
new buffer at 0x26000
new buffer at 0x4d000
new buffer at 0x9b000
11 memcpys in 16383 iterations

musl on x86_64:

new buffer at 0x40
new buffer at 0xfc0
new buffer at 0x1000
new buffer at 0x2000
new buffer at 0x3000
new buffer at 0x4000
new buffer at 0xa000
new buffer at 0xb000
new buffer at 0xc000
new buffer at 0x21000
new buffer at 0x22000
new buffer at 0x23000
new buffer at 0x66000
new buffer at 0x67000
new buffer at 0xcf000
15 memcpys in 16383 iterations

So it looks like you can usually rely on libc to handle resizes that do not cross page boundaries without having to copy the buffer.

The way I see it, unless you can find a way to use a data structure that avoids the copies altogether, skip the track-capacity-and-do-power-of-2-resizes approach in your application and let your libc do the heavy-lifting for you.
