Fastest way to allocate an array of strings

Posted 2024-12-08 13:00:57

I have a function that takes an array of strings (buffer) and needs to increase its size, so I call realloc:

temp = (char**) realloc (buffer, newSize * (sizeof(char*)));
if (temp == NULL)
    return false;
else
    buffer = temp;

So far so good. Now, for every new cell I must call malloc with the correct size. Note that newSize is always even, and that strings at odd indices have a different length from those at even indices.

for (i = oldSize; i < newSize; i++){
    support = (char*) malloc (LENGTH1 * sizeof(char));
    if (support == NULL){
        marker = i;
        failedMalloc = true;
        break;
    }
    else
        buffer[i] = support;

    i++;

    support = (char*) malloc (LENGTH2 * sizeof(char));
    if (support == NULL){
        marker = i;
        failedMalloc = true;
        break;
    }
    else
        buffer[i] = support;

}

Since I work with huge data sets, sooner or later I will run out of memory, and either the realloc or one of the mallocs will fail. If it is one of the mallocs that fails, I risk having to call free millions of times to release the memory allocated so far, which takes a lot of time. Is there any way to speed up this cleanup or, even better, avoid it?

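if (failedMalloc){
    /* release only the strings allocated by the loop above */
    for (i = oldSize; i < marker; i++)
        free(buffer[i]);
    /* shrink the pointer array back; keep the old pointer if realloc fails */
    temp = (char**) realloc (buffer, oldSize * (sizeof(char*)));
    if (temp != NULL)
        buffer = temp;
}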

PS: Yes, I know that pointer arithmetic is supposedly faster than array indexing. I will switch to it once this problem is solved; for the moment I prefer array indexing because I find it less error-prone.

Comments (4)

殊姿 2024-12-15 13:00:57

Instead of allocating each string individually, allocate them in blocks. You could for example malloc 128*(LENGTH1+LENGTH2) and have room for 256 consecutive strings. Whenever your index crosses a block boundary, malloc another big block and use modulo arithmetic to get an offset into it for the start of the string.

P.S. sizeof(char) is guaranteed to be 1.
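A minimal sketch of the block idea, for illustration only. The sizes (LENGTH1 = 16, LENGTH2 = 48, 128 pairs per block) and the names StringBlocks, blocks_reserve, blocks_get and blocks_free are assumptions made up for this example, not something from the question. Strings are addressed by index and located with division and modulo, so there is no per-string malloc at all:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Illustrative sizes; the question does not give the real values. */
enum { LENGTH1 = 16, LENGTH2 = 48, PAIRS_PER_BLOCK = 128 };
enum { STRINGS_PER_BLOCK = 2 * PAIRS_PER_BLOCK,
       BLOCK_BYTES = PAIRS_PER_BLOCK * (LENGTH1 + LENGTH2) };

typedef struct {
    char **blocks;   /* one pointer per big block */
    size_t nblocks;  /* number of blocks currently allocated */
} StringBlocks;

/* Grow storage so that strings [0, count) are addressable.
 * On failure, every block allocated earlier stays valid. */
bool blocks_reserve(StringBlocks *sb, size_t count)
{
    size_t needed = (count + STRINGS_PER_BLOCK - 1) / STRINGS_PER_BLOCK;
    if (needed <= sb->nblocks)
        return true;

    char **tmp = realloc(sb->blocks, needed * sizeof *tmp);
    if (tmp == NULL)
        return false;
    sb->blocks = tmp;

    while (sb->nblocks < needed) {
        char *block = malloc(BLOCK_BYTES);
        if (block == NULL)
            return false;
        sb->blocks[sb->nblocks++] = block;
    }
    return true;
}

/* Start of string i: even indices get LENGTH1 bytes, odd ones LENGTH2. */
char *blocks_get(const StringBlocks *sb, size_t i)
{
    assert(i / STRINGS_PER_BLOCK < sb->nblocks);
    size_t within = i % STRINGS_PER_BLOCK;
    size_t offset = (within / 2) * (LENGTH1 + LENGTH2)
                  + (within % 2 ? LENGTH1 : 0);
    return sb->blocks[i / STRINGS_PER_BLOCK] + offset;
}

/* One free per block instead of one per string. */
void blocks_free(StringBlocks *sb)
{
    for (size_t b = 0; b < sb->nblocks; b++)
        free(sb->blocks[b]);
    free(sb->blocks);
    sb->blocks = NULL;
    sb->nblocks = 0;
}
```

With these numbers, cleaning up a million strings takes roughly 4,000 calls to free rather than a million. Start from a zero-initialised StringBlocks, call blocks_reserve(&sb, newSize) when the array must grow, and use blocks_get(&sb, i) wherever the original code used buffer[i].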

打小就很酷 2024-12-15 13:00:57

Allocate larger blocks of memory. The fewer malloc calls, the better. The fastest approach is to precalculate the required size and allocate only once.

Also, using pointer arithmetic will not produce any visible difference here.
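As a rough sketch of the "allocate only once" idea, assuming the final number of strings is known up front: the function name alloc_string_table and its parameters are hypothetical. The pointer array and all string storage live in a single malloc, so either everything is allocated or nothing is, and cleanup is one free:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Allocate `count` strings (count must be even) with ONE malloc call.
 * Layout: [count pointers][len1 bytes][len2 bytes][len1 bytes]...
 * Returns NULL on failure; release everything with a single free(). */
char **alloc_string_table(size_t count, size_t len1, size_t len2)
{
    assert(count % 2 == 0);
    size_t pairs = count / 2;

    /* Guard the size computation against overflow. */
    if (count > SIZE_MAX / sizeof(char *) || len1 > SIZE_MAX - len2)
        return NULL;
    size_t header = count * sizeof(char *);
    size_t pair_bytes = len1 + len2;
    if (pair_bytes != 0 && pairs > (SIZE_MAX - header) / pair_bytes)
        return NULL;

    char **table = malloc(header + pairs * pair_bytes);
    if (table == NULL)
        return NULL;

    /* String storage starts right after the pointer array. */
    char *data = (char *)(table + count);
    for (size_t i = 0; i < count; i += 2) {
        table[i] = data;
        data += len1;
        table[i + 1] = data;
        data += len2;
    }
    return table;
}
```

The returned table can be indexed exactly like buffer in the question. The trade-off is that it cannot simply be grown with realloc: the block may move, which would invalidate the stored pointers, so they would have to be recomputed after every resize.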

顾北清歌寒 2024-12-15 13:00:57

You could write your own allocation and deallocation routines, and use them instead of malloc/free for the strings. If your routines malloc one or more big buffers and portion out little bits of them, then you can free the whole lot in one go just by calling free on each big buffer.

The general idea works especially well when all allocations are the same size, in which case it's called a "pool allocator". Here, for each array of strings you could have one associated pool for the LENGTH1 allocations and another for the LENGTH2 allocations.

I say, "write your own", but no doubt there are simple open-source pool allocators out there for the taking.
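To make this concrete, here is a deliberately small sketch of a fixed-size pool. The names Pool, pool_init, pool_alloc and pool_destroy are invented for this example, and it leaves out the per-item free list a full pool allocator would have, since the question only needs "free everything at once". It also assumes item_size * items_per_chunk does not overflow size_t:

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

typedef struct PoolChunk {
    struct PoolChunk *next;
    char data[];              /* flexible array member (C99) */
} PoolChunk;

typedef struct {
    size_t item_size;         /* bytes per string, e.g. LENGTH1 */
    size_t items_per_chunk;   /* strings carved out of each big malloc */
    size_t used;              /* items handed out from the newest chunk */
    PoolChunk *chunks;        /* linked list, newest chunk first */
} Pool;

void pool_init(Pool *p, size_t item_size, size_t items_per_chunk)
{
    assert(item_size > 0 && items_per_chunk > 0);
    p->item_size = item_size;
    p->items_per_chunk = items_per_chunk;
    p->used = items_per_chunk;  /* forces a chunk on the first allocation */
    p->chunks = NULL;
}

/* Hand out the next fixed-size slot; malloc only when a chunk is full. */
char *pool_alloc(Pool *p)
{
    if (p->used == p->items_per_chunk) {
        /* Assumes item_size * items_per_chunk fits in size_t. */
        PoolChunk *c = malloc(sizeof *c + p->item_size * p->items_per_chunk);
        if (c == NULL)
            return NULL;
        c->next = p->chunks;
        p->chunks = c;
        p->used = 0;
    }
    return p->chunks->data + p->item_size * p->used++;
}

/* Release every string from this pool: one free per chunk. */
void pool_destroy(Pool *p)
{
    while (p->chunks != NULL) {
        PoolChunk *next = p->chunks->next;
        free(p->chunks);
        p->chunks = next;
    }
    p->used = p->items_per_chunk;
}
```

In the question's loop you would keep two pools, one initialised with LENGTH1 and one with LENGTH2, and replace each malloc with pool_alloc on the matching pool. If an allocation fails, pool_destroy on both pools replaces the long loop of free calls.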

慢慢从新开始 2024-12-15 13:00:57

One way to avoid wasting memory is to allocate a larger block each time you need to malloc, with the size rounded up to a power of two (2^n), e.g.:

size_t m_size = 1;

// when a malloc is needed: double m_size until it covers require_size
while (m_size < require_size)
    m_size *= 2;
void *p = malloc(m_size);