How do I grow a buffer without invalidating pointers into it?
The terms 'pool' and 'buffer' may be used interchangeably here.
Suppose I have a pool that I want to allocate at the beginning of the program, so as not to call new all the time.
Now, I don't want to artificially limit myself on the size of the pool, but if I reallocate a bigger pool, all pointers to the old one will be invalidated, which certainly isn't very cool.
One way I thought of was "paging", aka
const int NUM_PAGES = 5;
char* pool[NUM_PAGES]; // each entry points to a separately allocated page
And allocate a new page when the current one fills up, instead of reallocating the one pool. This would let all pointers stay valid, but it makes managing the paged pool a bit more difficult. Also, I'm limiting myself to a fixed number of pages, so in the end I'm still limiting the size of the pool.
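For illustration, a minimal sketch of such a paged pool might look like the following (PagedPool and its growth policy are invented here; alignment, oversized requests and deallocation are ignored):

#include <cstddef>
#include <memory>
#include <vector>

// Pages are only ever added, never reallocated, so pointers handed out
// earlier remain valid even as the pool grows.
class PagedPool {
public:
    explicit PagedPool(std::size_t page_size) : page_size_(page_size) {}

    // Assumes bytes <= page_size_; alignment is not handled in this sketch.
    void* allocate(std::size_t bytes) {
        if (pages_.empty() || used_ + bytes > page_size_) {
            pages_.push_back(std::make_unique<char[]>(page_size_)); // grow by one page
            used_ = 0;
        }
        void* p = pages_.back().get() + used_;
        used_ += bytes;
        return p;
    }

private:
    std::size_t page_size_;
    std::size_t used_ = 0;                       // bytes used in the newest page
    std::vector<std::unique_ptr<char[]>> pages_; // resizing this vector moves the
                                                 // unique_ptrs, not the pages themselves
};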
Another way was to have a mapping from the pointers my allocation function returns to pointers into the real memory space. This would let all the old pointers stay valid, but it would take more memory, and I'd need to write a smart pointer, returned from my allocation function, that does the mapping.
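A rough sketch of the mapping idea, assuming stable integer handles that are resolved on every access (all names here are made up; alignment and freeing are again ignored):

#include <cstddef>
#include <vector>

// Callers keep a Handle (a stable index); every access goes through resolve(),
// which always points into the pool's current buffer.
struct Handle {
    std::size_t index;
};

class MappedPool {
public:
    Handle allocate(std::size_t bytes) {
        offsets_.push_back(storage_.size());
        storage_.resize(storage_.size() + bytes); // may reallocate, handles stay valid
        return Handle{offsets_.size() - 1};
    }

    // The returned raw pointer is only valid until the next allocate();
    // re-resolving on each use is the cost of this scheme.
    void* resolve(Handle h) {
        return storage_.data() + offsets_[h.index];
    }

private:
    std::vector<char> storage_;
    std::vector<std::size_t> offsets_;
};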
What other possible ways are there to achieve what I want? What (dis)advantages have I missed in the example implementations above?
You're talking about something that reminds me of a std::deque. I'm not really sure if you can use a std::deque as is, or if you'll simply need to use its basic design to implement some kind of allocator.
Extending your paged "pool" concept, what if you stored the pages as a linked list?
To allocate new data from the pool you should only need to have access to the top "page", which will be at the head of the list, so that's O(1). If you need to grow the total size of your pool, allocate a new page and push it onto the head of the list, also O(1).
I use basically this same idea for pooled allocators, but also with a "free list" of recently deallocated items...
EDIT:
As per your comment, if you want to also make use of deallocated data, you can also store a free list, possibly also as a linked list. So when you deallocate data you push a pointer and size marker onto the free list. When you allocate data from the pool, you first check if any items on the free list can be used, if not you allocate from the pool.
The standard memory managers often already do something like this, so this approach will not always be better. Specifically, I usually only use this type of custom allocator when the allocated items are the same size (so that traversal of the free list is O(1)!). A custom allocator for std::list would be one example.
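A stripped-down sketch of that combination (fixed-size slots, new pages pushed onto the head of a singly linked page list, freed slots reused through an intrusive free list) could look roughly like this; the names are only illustrative and thread safety is ignored:

#include <cstddef>
#include <new>

class FixedPool {
public:
    FixedPool(std::size_t slot_size, std::size_t slots_per_page)
        // Round the slot size up so every slot can hold the free-list link
        // and stays pointer-aligned.
        : slot_size_(((slot_size + sizeof(void*) - 1) / sizeof(void*)) * sizeof(void*)),
          slots_per_page_(slots_per_page) {}

    ~FixedPool() {
        while (pages_) {
            Page* next = pages_->next;
            ::operator delete(pages_);
            pages_ = next;
        }
    }

    void* allocate() {
        if (free_list_) {                       // reuse a previously freed slot first
            void* slot = free_list_;
            free_list_ = *static_cast<void**>(free_list_);
            return slot;
        }
        if (!pages_ || used_ == slots_per_page_)
            addPage();                          // O(1): push a new page onto the head
        return pages_->data() + (used_++) * slot_size_;
    }

    void deallocate(void* slot) {               // O(1): push the slot onto the free list
        *static_cast<void**>(slot) = free_list_;
        free_list_ = slot;
    }

private:
    struct Page {
        Page* next;
        char* data() { return reinterpret_cast<char*>(this + 1); }
    };

    void addPage() {
        void* raw = ::operator new(sizeof(Page) + slot_size_ * slots_per_page_);
        Page* page = static_cast<Page*>(raw);
        page->next = pages_;
        pages_ = page;
        used_ = 0;
    }

    std::size_t slot_size_;
    std::size_t slots_per_page_;
    std::size_t used_ = 0;       // slots used in the head page
    Page* pages_ = nullptr;      // singly linked list of pages, newest first
    void* free_list_ = nullptr;  // intrusive list threaded through freed slots
};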
Hope this helps.
Consider using Boost pools.
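If I remember the Boost.Pool interface correctly, usage is roughly along these lines (treat the exact header and member names as something to double-check):

#include <boost/pool/object_pool.hpp>

struct Node { int value; Node* next; };

int main() {
    boost::object_pool<Node> pool;   // grows internally in chunks; existing
                                     // objects are never moved or invalidated
    Node* n = pool.construct();      // constructs a Node inside the pool
    n->value = 42;
    pool.destroy(n);                 // optional; the pool's destructor frees everything
}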
One question, of course, is why trouble yourself so?
You say you wish to avoid the new overhead, but why not instead choose a better implementation of new? tcmalloc and jemalloc, for example, are generally very good contenders for multi-threaded applications.

What you are trying to create is, in fact, very similar to writing a customized malloc/new implementation. If you really don't want to use a proven implementation, you would therefore benefit from the insights of those who wrote one.

My personal interest leans toward the BiBOP strategy (Big Bag of Pages) to fight off fragmentation. The idea is to have a dedicated pool per allocation size, so a simple free list (interleaved with the allocations) is enough and no merging is required. Usually this is done as long as the requested size is smaller than the page size (I've seen 4KB used), and anything bigger is allocated on its own (in several pages). Discarded pages are recycled.

The main interest I have is that with a BiBOP it is easy to maintain an interval tree from address range to page header, and thus to determine an object's full size from the address of a (possibly) internal element (like an attribute), which is useful for garbage collection (the reference-following step).
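To make the per-size-class idea concrete, a toy sketch might look like this (the size classes, page size and all names are invented; alignment, page recycling and thread safety are ignored, and pages are never released):

#include <cstddef>
#include <cstdlib>
#include <vector>

// One bin per size class, each with its own pages and free list,
// so freeing never needs to merge neighbouring blocks.
class SizeClassAllocator {
    static constexpr std::size_t kPageSize = 4096;
    static constexpr std::size_t kClassSizes[4] = {16, 64, 256, 1024};

    struct Bin {
        std::vector<void*> pages;   // pages owned by this size class
        void* free_list = nullptr;  // freed slots, threaded through themselves
        std::size_t used = 0;       // bytes used in the newest page
    } bins_[4];

    static int classFor(std::size_t bytes) {
        for (int i = 0; i < 4; ++i)
            if (bytes <= kClassSizes[i]) return i;
        return -1;                                       // too big for any size class
    }

public:
    void* allocate(std::size_t bytes) {
        int c = classFor(bytes);
        if (c < 0) return std::malloc(bytes);            // big requests allocated on their own
        Bin& bin = bins_[c];
        if (bin.free_list) {                             // reuse a freed slot of this size
            void* slot = bin.free_list;
            bin.free_list = *static_cast<void**>(slot);
            return slot;
        }
        if (bin.pages.empty() || bin.used + kClassSizes[c] > kPageSize) {
            bin.pages.push_back(std::malloc(kPageSize)); // new page for this class
            bin.used = 0;
        }
        void* slot = static_cast<char*>(bin.pages.back()) + bin.used;
        bin.used += kClassSizes[c];
        return slot;
    }

    void deallocate(void* p, std::size_t bytes) {
        int c = classFor(bytes);
        if (c < 0) { std::free(p); return; }
        *static_cast<void**>(p) = bins_[c].free_list;    // push onto the class's free list
        bins_[c].free_list = p;
    }
};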
For multi-threaded allocation, tcmalloc and jemalloc use two different approaches:

jemalloc uses per-thread pools: good with a fixed number of threads / thread pools, but it makes creating a thread more costly.

tcmalloc uses a global multi-threaded pool with lock-free techniques, and tries to load-balance threads across the available pools to limit contention: a thread looks for a new pool if the one it uses is "locked" (rather than waiting), and each thread caches the last used pool in a thread-local variable. Threads are therefore lightweight, but there may be some contention if the number of pools is too low relative to the number of active threads.
A few thoughts:
When you have a std::vector<T*>, adding elements and triggering a resize invalidates references/pointers/iterators into that container, but not references/pointers that directly address the pointed-to objects. So a layer of indirection might solve your problem, depending on what you're really trying to do with these references/pointers/iterators.

In systems with virtual memory and a large address space, you can make huge allocations without the pages actually being allocated from physical memory resources until they're written to. Consequently, on such systems you can initially give a vector a larger-than-ever-needed capacity without feeling like you're wasting anything valuable.

Other containers - std::map<> and std::list<> - don't move their existing elements as new ones are added, so iterators/pointers/references remain valid.
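As a small illustration of the indirection point (Widget and the sizes are made up):

#include <iostream>
#include <memory>
#include <vector>

struct Widget { int id; };

int main() {
    std::vector<std::unique_ptr<Widget>> widgets; // the vector owns pointers, not Widgets
    widgets.push_back(std::make_unique<Widget>(Widget{1}));
    Widget* stable = widgets[0].get();            // address of the Widget itself

    for (int i = 2; i < 1000; ++i)                // force plenty of reallocations
        widgets.push_back(std::make_unique<Widget>(Widget{i}));

    std::cout << (stable == widgets[0].get()) << '\n'; // prints 1: the Widget never moved
}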