Fastest way to allocate an array of strings
I have a function that takes an array of strings (buffer) and needs to increase its size, so I call realloc:
temp = (char**) realloc(buffer, newSize * sizeof(char*));
if (temp == NULL)
    return false;   /* buffer is still valid and unchanged here */
buffer = temp;
So far everything is fine. Now, for every new cell, I must call malloc with the correct size. Note that newSize is always even and that odd-indexed strings have a different length than even-indexed ones.
for (i = oldSize; i < newSize; i++) {
    /* even cell: LENGTH1 bytes */
    support = (char*) malloc(LENGTH1);
    if (support == NULL) {
        marker = i;          /* first cell that was NOT allocated */
        failedMalloc = true;
        break;
    }
    buffer[i] = support;
    i++;
    /* odd cell: LENGTH2 bytes */
    support = (char*) malloc(LENGTH2);
    if (support == NULL) {
        marker = i;
        failedMalloc = true;
        break;
    }
    buffer[i] = support;
}
Since I work with huge data sets, sooner or later I will run out of memory, and either the realloc or one of the mallocs will fail. The problem is that if one of the mallocs fails, I may have to call free millions of times to release the memory again, which takes a lot of time. Is there any way to speed up this process or, even better, avoid it?
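if (failedMalloc) {
    /* release only the cells allocated by this call */
    for (i = oldSize; i < marker; i++)
        free(buffer[i]);
    /* shrink back to the old size; keep the old pointer if realloc fails */
    temp = (char**) realloc(buffer, oldSize * sizeof(char*));
    if (temp != NULL)
        buffer = temp;
}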
PS: Yes, I know that pointer arithmetic is faster than array indexing. I will switch to it once I have solved this problem; for the moment I prefer array indexing because I find it less error-prone. The final version will use pointer arithmetic.
Comments (4)
Instead of allocating each string individually, allocate them in blocks. For example, you could malloc 128*(LENGTH1+LENGTH2) bytes and have room for 256 consecutive strings. Whenever your index crosses a block boundary, malloc another big block, and use modulo arithmetic to compute the offset of each string's start within its block.
P.S. sizeof(char) is guaranteed to be 1.
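A minimal sketch of this block scheme, assuming made-up values for LENGTH1 and LENGTH2 and hypothetical names (StringStore, store_add_block, store_get, store_free) that do not appear in the question. Each block holds 128 pairs, i.e. 256 strings, and a string's address is derived from its index by division and modulo:

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define LENGTH1 16            /* assumed size of even-indexed strings */
#define LENGTH2 32            /* assumed size of odd-indexed strings  */
#define PAIRS_PER_BLOCK 128   /* 128 pairs = 256 strings per block    */
#define STRINGS_PER_BLOCK (2 * PAIRS_PER_BLOCK)

/* Hypothetical container: one pointer per big block, not per string. */
typedef struct {
    char **blocks;
    size_t block_count;
} StringStore;

/* Add room for 256 more strings. Returns 0 on success, -1 on failure
   (the store is left unchanged on failure). */
int store_add_block(StringStore *s) {
    char *block = malloc((size_t)PAIRS_PER_BLOCK * (LENGTH1 + LENGTH2));
    if (block == NULL)
        return -1;
    char **tmp = realloc(s->blocks, (s->block_count + 1) * sizeof *tmp);
    if (tmp == NULL) {
        free(block);
        return -1;
    }
    tmp[s->block_count] = block;
    s->blocks = tmp;
    s->block_count++;
    return 0;
}

/* Address of string i: pick the block, then the pair inside the block,
   then the first (LENGTH1) or second (LENGTH2) slot of that pair. */
char *store_get(const StringStore *s, size_t i) {
    assert(i / STRINGS_PER_BLOCK < s->block_count);
    char *block = s->blocks[i / STRINGS_PER_BLOCK];
    size_t in_block = i % STRINGS_PER_BLOCK;
    size_t pair = in_block / 2;
    return block + pair * (LENGTH1 + LENGTH2) + (in_block % 2 ? LENGTH1 : 0);
}

/* Cleanup costs one free per block instead of one per string. */
void store_free(StringStore *s) {
    for (size_t b = 0; b < s->block_count; b++)
        free(s->blocks[b]);
    free(s->blocks);
    s->blocks = NULL;
    s->block_count = 0;
}
```

With this layout a failed allocation wastes at most one block, and releasing a million strings takes roughly 4000 calls to free rather than a million.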
Allocate larger blocks of memory. The fewer malloc calls, the better. The fastest approach is to precalculate the required size and allocate only once.
Also, using pointer arithmetic will not produce any visible difference here.
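As a rough illustration of allocating only once per growth step, the sketch below fills the new cells of the pointer array from a single slab. The function name alloc_strings_once and the LENGTH1/LENGTH2 values are assumptions for the example; it also assumes, as in the question, that the number of new cells is even and that the first new cell gets LENGTH1 bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define LENGTH1 16   /* assumed size of even-indexed strings */
#define LENGTH2 32   /* assumed size of odd-indexed strings  */

/* Hypothetical helper: makes buffer[oldSize .. newSize-1] point into one
   slab obtained with a single malloc. Returns the slab (release it later
   with one free) or NULL on failure, in which case buffer is untouched. */
char *alloc_strings_once(char **buffer, size_t oldSize, size_t newSize) {
    assert(newSize > oldSize && (newSize - oldSize) % 2 == 0);
    size_t pairs = (newSize - oldSize) / 2;
    if (pairs > SIZE_MAX / (LENGTH1 + LENGTH2))
        return NULL;                      /* size computation would overflow */
    char *slab = malloc(pairs * (LENGTH1 + LENGTH2));
    if (slab == NULL)
        return NULL;                      /* nothing to clean up */
    char *p = slab;
    for (size_t i = oldSize; i < newSize; i += 2) {
        buffer[i] = p;                    /* LENGTH1-byte string */
        p += LENGTH1;
        buffer[i + 1] = p;                /* LENGTH2-byte string */
        p += LENGTH2;
    }
    return slab;
}
```

The trade-off is that individual strings can no longer be passed to free; the caller has to remember the slab pointer and release everything from that growth step together.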
You could write your own allocation and deallocation routines, and use them instead of malloc/free for the strings. If your routines malloc one or more big buffers and portion out little bits of them, then you can free the whole lot in one go just by calling free on each big buffer.
The general idea works especially well when all allocations are the same size, in which case it is called a "pool allocator". Here, for each array of strings you could have one associated pool for the LENGTH1 allocations and another for the LENGTH2 allocations.
I say "write your own", but no doubt there are simple open-source pool allocators out there for the taking.
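The following is a simplified sketch of such a pool, not a full allocator: it hands out fixed-size items from large chunks and only supports releasing everything at once (there is no per-item free). All names (Pool, PoolChunk, pool_init, pool_alloc, pool_destroy) are hypothetical, and alignment is ignored on the assumption that the items are plain char strings.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* One big buffer obtained from malloc; chunks form a singly linked list. */
typedef struct PoolChunk {
    struct PoolChunk *next;
    char data[];               /* C99 flexible array member */
} PoolChunk;

typedef struct {
    size_t item_size;          /* e.g. LENGTH1 or LENGTH2 */
    size_t items_per_chunk;
    size_t used;               /* items handed out from the newest chunk */
    PoolChunk *chunks;         /* newest chunk first */
} Pool;

void pool_init(Pool *p, size_t item_size, size_t items_per_chunk) {
    assert(item_size > 0 && items_per_chunk > 0);
    p->item_size = item_size;
    p->items_per_chunk = items_per_chunk;
    p->used = items_per_chunk; /* forces a chunk allocation on first use */
    p->chunks = NULL;
}

/* Returns item_size bytes, or NULL if a new chunk could not be allocated. */
void *pool_alloc(Pool *p) {
    if (p->used == p->items_per_chunk) {
        PoolChunk *c = malloc(sizeof *c + p->item_size * p->items_per_chunk);
        if (c == NULL)
            return NULL;
        c->next = p->chunks;
        p->chunks = c;
        p->used = 0;
    }
    return p->chunks->data + p->item_size * p->used++;
}

/* Frees every chunk: one free per big buffer, not one per string. */
void pool_destroy(Pool *p) {
    while (p->chunks != NULL) {
        PoolChunk *next = p->chunks->next;
        free(p->chunks);
        p->chunks = next;
    }
    p->used = p->items_per_chunk;
}
```

For the question's layout, one pool would be initialized with item size LENGTH1 and a second with LENGTH2; on failure, calling pool_destroy on both releases all strings with a handful of free calls.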
One way to avoid wasting memory is to malloc a larger amount each time: whenever you need to malloc,
malloc a fixed size (aligned to 2^n), e.g.