高效内存重新分配问题
假设我有一个程序(例如 C++)分配多个对象,但永远不会大于给定的大小(我们称之为 MAX_OBJECT_SIZE)。
我在堆上还有一个区域(我将其称为“页面”)(用 malloc(REGION_SIZE) 分配,其中 REGION_SIZE >= MAX_OBJECT_SIZE)。
我继续在该页面中保留空间,直到填充的空间等于 PAGE_SIZE(或至少获得 > PAGE_SIZE - MAX_OBJECT_SIZE)。
现在,我想分配更多内存。显然我之前的“页面”还不够。所以我至少有两个选择:
- 使用 realloc(page, NEW_SIZE),其中 NEW_SIZE >页大小;
- 分配一个新的“页面”(page2)并将新对象放在那里。
如果我想要一个自定义分配函数,那么:
- 使用第一种方法,我会看到我填充了多少,然后将我的新对象放在那里(并将对象的大小添加到我填充的内存变量中)。
- 使用第二种方法,我有一个页面列表(向量?数组?),然后查找当前页面,然后在所选页面上使用类似于 1 的方法。
最终,我也需要一种释放内存的方法,但我可以弄清楚那部分。
所以我的问题是:解决此类问题最有效的方法是什么?是选项 1、选项 2 还是其他我没有考虑过的选项?是否需要一个小基准/足以对现实世界的情况得出结论? 我知道不同的操作可能会有不同的表现,但我正在寻找一个总体指标。
Let's say I have a program(C++, for example) that allocates multiple objects, never bigger than a given size(let's call it MAX_OBJECT_SIZE).
I also have a region(I'll call it a "page") on the heap(allocated with, say, malloc(REGION_SIZE), where REGION_SIZE >= MAX_OBJECT_SIZE).
I keep reserving space in that page until the filled space equals PAGE_SIZE(or at least gets > PAGE_SIZE - MAX_OBJECT_SIZE).
Now, I want to allocate more memory. Obviously my previous "page" won't be enough. So I have at least two options:
- Use realloc(page, NEW_SIZE), where NEW_SIZE > PAGE_SIZE;
- Allocate a new "page"(page2) and put the new object there.
If I wanted to have a custom allocate function, then:
- Using the first method, I'd see how much I had filled, and then put my new object there(and add the size of the object to my filled memory variable).
- Using the second method, I'd have a list(vector? array?) of pages, then look for the current page, and then use a method similar to 1 on the selected page.
Eventually, I'd need a method to free memory too, but I can figure out that part.
So my question is: What is the most efficient way to solve a problem like this? Is it option 1, option 2 or some other option I haven't considered here? Is a small benchmark needed/enough to draw conclusions for real-world situations?
I understand that different operations may perform differently, but I'm looking for an overall metric.
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(6)
从您的问题中不清楚为什么您需要提前分配一大块内存,而不是根据需要为每个对象分配内存。我假设您将其用作连续数组。否则,根据需要
malloc
每个对象的内存会更有意义。如果它确实充当数组,
malloc
-ing另一个块会为您提供另一块内存,您必须通过另一个指针访问它(在您的情况下page2
)。因此,它不再位于连续的块上,并且您不能将这两个块用作一个数组的一部分。另一方面,realloc 分配一个连续的内存块。您可以将其用作单个数组,并执行各种指针算术,如果存在单独的块,则这是不可能的。当您确实想要缩小正在使用的块时,realloc 也很有用,但这可能不是您在这里想要做的。
因此,如果您将其用作数组,
realloc
基本上是更好的选择。否则,malloc
没有任何问题。实际上,您可能希望对创建的每个对象使用 malloc,而不必跟踪和微观管理内存块。It is not clear from your question why you need to allocate a big block of memory in advance rather than allocating memory for each object as needed. I'm assuming you are using it as a contiguous array. Otherwise, it would make more sense to
malloc
the memory of each object as it is needed.If it is indeed acting as an array,
malloc
-ing another block gives you another chunk of memory that you have to access via another pointer (in your casepage2
). Thus it is no longer on contiguous block and you cannot use the two blocks as part of one array.realloc
, on the other hand, allocates one contiguous block of memory. You can use it as a single array and do all sorts of pointer arithmetic not possible if there are separate blocks.realloc
is also useful when you actually want to shrink the block you are working with, but that is probably not what you are seeking to do here.So, if you are using this as an array,
realloc
is basically the better option. Otherwise, there is nothing wrong withmalloc
. Actually, you might want to usemalloc
for each object you create rather than having to keep track of and micro-manage blocks of memory.您没有提供有关您正在试验的平台的任何详细信息。例如,
Linux
和Windows
之间的realloc
存在一些性能差异。根据具体情况,如果
realloc
无法增长当前内存块并复制内存块,则可能必须分配一个新内存块。旧内存换成新内存,这是昂贵的。如果您确实不需要连续内存块,则应避免使用
realloc
。我的建议是使用第二种方法,或者使用自定义分配器(您可以实现一个简单的 好友分配器< /a> [2])。
您还可以使用更高级的内存分配器,例如
You have not given any details on what platform you are experimenting. There are some performance differences for
realloc
betweenLinux
andWindows
, for example.Depending on the situation,
realloc
might have to allocate a new memory block if it can't grow the current one and copy the old memory to the new one, which is expensive.If you don't really need a contiguous block of memory you should avoid using
realloc
.My sugestion would be to use the second approach, or use a custom allocator (you could implement a simple buddy allocator [2]).
You could also use more advanced memory allocators, like
在最坏的情况下,选项 1 可能会导致原始内存的“移动”,这是需要完成的额外工作。如果内存没有移动,无论如何都会初始化“额外”大小,这也是其他工作。因此,realloc 会被 malloc 方法“击败”,但要说多少,您应该进行测试(我认为内存请求完成时系统的表现存在偏差)。
根据您期望执行 realloc/malloc 的次数,它可能是一个有用的想法,也可能是一个无用的想法。无论如何我都会使用 malloc。
免费策略取决于执行情况。要整体释放所有页面,“遍历”它们就足够了;我将使用链接的“页面”而不是数组:将 sizeof(void *) 添加到“页面”大小,并且您可以使用额外的字节来存储指向下一页的指针。
如果您必须释放位于其中一个页面中任意位置的单个对象,则情况会变得更加复杂。我的想法是保留一个非连续的空闲“块”/“槽”列表(适合容纳任何对象)。当请求新的“块”时,首先从该列表中弹出一个值;如果它是空的,那么您将获得最后使用的页面中的下一个“槽”,并最终触发一个新页面。释放对象意味着将空槽地址放入堆栈/列表中(无论您喜欢使用什么)。
In the worst case, option 1 could cause a "move" of the original memory, that is an extrawork to be done. If the memory is not moved, anyway the "extra" size is initialized, which is other work too. So
realloc
would be "defeated" by the malloc method, but to say how much, you should do tests (and I think there's a bias on how the system is when the memory requests are done).Depending on how many times you expect the realloc/malloc have to be performed, it could be an useful idea or an unuseful one. I would use malloc anyway.
The free strategy depends on the implementation. To free all the pages as whole, it is enough to "traverse" them; instead of an array, I would use linked "pages": add sizeof(void *) to the "page" size, and you can use the extra bytes to store the pointer to the next page.
If you have to free a single object, located anywhere in one of the pages, it becomes a little bit more complex. My idea is to keep a list of non-sequential free "block"/"slot" (suitable to hold any object). When a new "block" is requested, first you pop a value from this list; if it is empty, then you get the next "slot" in the last in use page, and eventually a new page is triggered. Freeing an object, means just to put the empty slot address in a stack/list (whatever you prefer to use).
在 Linux(可能还有其他 POSIX 系统)中,还有第三种可能性,即通过
shm_open
使用内存映射区域。一旦您访问这样的区域,它就会被初始化为零,但据我所知,您从未访问过的页面是免费的,如果它不仅仅是您保留的虚拟内存中的地址范围。因此,您可以在执行开始时保留一大块内存(比您需要的更多),然后从一开始就增量填充它。In linux (and probably other POSIX systems) there is a third possibility, that is to use a memory mapped region with
shm_open
. Such a region is initialized by zeroes once you access it, but AFAIK pages that you never access come with no cost, if it isn't just the address-range in virtual memory that you reserve. So you could just reserve a large chunk of memory at the beginning (more than you ever would need) of your execution and then fill it incrementally from the start.选项 1。为了提高效率,NEW_SIZE 必须非线性地依赖于旧大小。否则,由于冗余复制,您可能会遇到 realloc() 性能 O(n^2) 的风险。我通常会这样做
new_size = old_size + old_size/4
(增加 25%),因为理论上最好的new_size = old_size*2
在最坏的情况下可能会保留太多未使用的内存。选项 2。它应该是更优化的,因为大多数现代操作系统(感谢 C++ 的 STL)已经针对大量小内存分配进行了很好的优化。而且小分配导致内存碎片的机会较小。
最后,这完全取决于您分配新对象的频率以及如何处理释放。如果你用 #1 分配了很多,那么在扩展时你会进行一些冗余复制,但释放非常简单,因为所有对象都在同一页中。如果您需要释放/重用对象,则使用 #2 您将花费一些时间浏览页面列表。
根据我的经验,#2 更好,因为移动大内存块可能会增加堆碎片率。 #2 还允许使用指针,因为对象不会更改其在内存中的位置(尽管对于某些应用程序,我更喜欢使用 pool_id/index 对而不是原始指针)。如果以后浏览页面成为问题,则可能过于优化。
最后你还应该考虑选项#3:libc。我认为 libc 的 malloc() 对于许多任务来说足够高效。请在投入更多时间之前对其进行测试。除非您陷入某些落后的 *NIX,否则对每个小对象使用 malloc() 应该没有问题。仅当我需要将对象放在异国情调的地方(例如 shm 或 mmap)时,我才使用自定义内存管理。还要记住多线程:malloc()/realloc()/free() 通常已经经过优化并支持 MT;您必须重新实现优化,以避免线程在内存管理上不断发生冲突。如果您想要内存池或区域,也已经有很多库可以实现。
Option 1. For it to be efficient, NEW_SIZE has to depend on old size non-linearly. Otherwise you risk running into O(n^2) performance of realloc() due to the redundant copying. I generally do
new_size = old_size + old_size/4
(increase by 25% percent) as theoretically bestnew_size = old_size*2
might in worst case reserve too much unused memory.Option 2. It should be more optimal as most modern OSs (thanks to C++'s STL) are already well optimized for flood of small memory allocations. And small allocations have lesser chance to cause memory fragmentation.
In the end it all depends how often you allocate the new objects and how do you handle freeing. If you allocate a lot with #1 you would have some redundant copying when expanding but freeing is dead simple since all objects are in the same page. If you would need to free/reuse the objects, with #2 you would be spending some time walking through the list of pages.
From my experience #2 is better, as moving around large memory blocks might increase rate of heap fragmentation. The #2 is also allows to use pointers as objects do not change their location in memory (though for some applications I prefer to use pool_id/index pairs instead of raw pointers). If walking through pages becomes a problem later, it can be too optimized.
In the end you should also consider option #3: libc. I think that libc's malloc() is efficient enough for many many tasks. Please test it before investing more of your time. Unless you are stuck on some backward *NIX, there should be no problem using malloc() for every smallish object. I used custom memory management only when I needed to put objects in exotic places (e.g. shm or mmap). Keep in mind the multi-threading too: malloc()/realloc()/free() generally are already optimized and MT-ready; you would have to reimplement the optimizations anew to avoid threads being constantly colliding on memory management. And if you want to have memory pools or zones, there are already bunch of libraries for that too.
根据我的经验,选项 2 更容易使用且开销最小。 Realloc不保证它会增加现有内存的大小。而在实践中几乎从来没有这样做过。如果您使用它,您将需要返回并重新映射所有旧对象。这需要您记住分配的每个对象的位置...这可能会带来大量的开销。
但如果不确切知道您使用的指标,就很难定义“最高效”。
这是我经常使用的内存管理器。它适用于整个应用程序而不仅仅是一个对象。
allocs:
对于每次分配,确定分配的对象的大小。
1 查看该大小的对象的释放链接列表,看看是否有任何内容被释放,如果是,则取出第一个释放的内容
2 在查找表中查找,如果没有找到
2.1 分配一个由 N 个正在分配的大小的对象组成的数组。
3 返回所需大小的下一个自由对象。
3.1 如果数组已满则添加新页面。
程序员可以调整 N 个对象。如果您知道有一百万个 16 字节对象,您可能希望 N 稍高一些。
对于超过某个大小 X 的对象,不要保留数组,而是简单地分配一个新对象。
frees:
确定对象的大小,将其添加到frees的链接列表中。
如果分配的对象的大小小于指针的大小,则链接列表不需要产生任何内存开销。只需使用已分配的内存来存储节点即可。
这种方法的问题是,在应用程序退出或程序员决定对内存进行碎片整理之前,内存永远不会返回到操作系统。碎片整理是另一篇文章。这是可以做到的。
In my experience option 2 is much easier to work with has minimal overhead. Realloc does not guarantee it will increase the size of existing memory. And in practice it almost never does. If you use it you will need to go back and remap all of the old objects. That would require that you remember where every object allocated was... That can be a ton over overhead.
But it's hard to qualify "most efficient" without knowing exactly what metrics you use.
This is the memory manager I always use. It works for the entire application not just one object.
allocs:
for every allocation determine the size of the object allocated.
1 look at a link list of frees for objects of that size to see if anything has been freed if so take the first free
2 look for in a look up table and if not found
2.1 allocate an array of N objects of the size being allocated.
3 return the next free object of the desired size.
3.1 if the array is full add a new page.
N objects can be programmer tunned. If you know you have a million 16 byte objects you might want that N to be slightly higher.
for objects over some size X, do not keep an array simply allocate a new object.
frees:
determine the size of the object, add it to the link list of frees.
if the size of the object allocated is less than the size of a pointer the link list does not need to incur any memory overhead. simply use the already allocated memory to store the nodes.
The problem with this method is memory is never returned to the operating system until the application has exited or the programmer decides to defragment the memory. defragmenting is another post. it can be done.