Compelling examples of custom C++ allocators?

Posted on 2024-07-19 17:23:34

What are some really good reasons to ditch std::allocator in favor of a custom solution? Have you run across any situations where it was absolutely necessary for correctness, performance, scalability, etc? Any really clever examples?

Custom allocators have always been a feature of the Standard Library that I haven't had much need for. I was just wondering if anyone here on SO could provide some compelling examples to justify their existence.

Comments (18)

小姐丶请自重 2024-07-26 17:23:35

As I mention here, I've seen Intel TBB's custom STL allocator significantly improve performance of a multithreaded app simply by changing a single

std::vector<T>

to

std::vector<T,tbb::scalable_allocator<T> >

(this is a quick and convenient way of switching the allocator to use TBB's nifty thread-private heaps; see page 59 in this document)
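The one-line change above can also be centralized so the whole codebase flips allocators with a single build flag. This is only a sketch; the `USE_TBB` macro, the header path, and the alias names are my assumptions, not part of the original answer:

```cpp
#include <cstddef>
#include <vector>

// Route every container through a project-wide allocator alias; defining
// USE_TBB in the build then switches all of them at once. Without TBB the
// alias falls back to std::allocator, so the indirection costs nothing.
#ifdef USE_TBB
#include <tbb/scalable_allocator.h>
template <typename T> using app_allocator = tbb::scalable_allocator<T>;
#else
template <typename T> using app_allocator = std::allocator<T>;
#endif

template <typename T>
using app_vector = std::vector<T, app_allocator<T>>;
```

Compiled without `USE_TBB` this is plain `std::allocator`, so you only pay for the thread-private heaps once you opt in.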

请你别敷衍 2024-07-26 17:23:35

One area where custom allocators can be useful is game development, especially on game consoles, as they have only a small amount of memory and no swap. On such systems you want to make sure that you have tight control over each subsystem, so that one uncritical system can't steal the memory from a critical one. Other things like pool allocators can help to reduce memory fragmentation. You can find a long, detailed paper on the topic at:

EASTL -- Electronic Arts Standard Template Library

听不够的曲调 2024-07-26 17:23:35

I am working on an mmap allocator that allows vectors to use memory from
a memory-mapped file. The goal is to have vectors whose storage lives
directly in the virtual memory mapped by mmap. Our problem is to
improve the reading of really large files (>10 GB) into memory with no copy
overhead, therefore I need this custom allocator.

So far I have the skeleton of a custom allocator
(derived from std::allocator), which I think is a good starting
point for writing your own allocators. Feel free to use this piece of code
in whatever way you want:

#include <memory>
#include <stdio.h>

namespace mmap_allocator_namespace
{
        // See StackOverflow replies to this answer for important commentary about inheriting from std::allocator before replicating this code.
        template <typename T>
        class mmap_allocator: public std::allocator<T>
        {
public:
                typedef size_t size_type;
                typedef T* pointer;
                typedef const T* const_pointer;

                template<typename _Tp1>
                struct rebind
                {
                        typedef mmap_allocator<_Tp1> other;
                };

                pointer allocate(size_type n, const void *hint=0)
                {
                        fprintf(stderr, "Alloc %zu bytes.\n", n*sizeof(T));
                        return std::allocator<T>::allocate(n, hint);
                }

                void deallocate(pointer p, size_type n)
                {
                        fprintf(stderr, "Dealloc %zu bytes (%p).\n", n*sizeof(T), (void *)p);
                        std::allocator<T>::deallocate(p, n);
                }

                mmap_allocator() throw(): std::allocator<T>() { fprintf(stderr, "Hello allocator!\n"); }
                mmap_allocator(const mmap_allocator &a) throw(): std::allocator<T>(a) { }
                template <class U>                    
                mmap_allocator(const mmap_allocator<U> &a) throw(): std::allocator<T>(a) { }
                ~mmap_allocator() throw() { }
        };
}

To use this, declare an STL container as follows:

using namespace std;
using namespace mmap_allocator_namespace;

vector<int, mmap_allocator<int> > int_vec(1024, 0, mmap_allocator<int>());

It can be used, for example, to log whenever memory is allocated. What is necessary
is the rebind struct; otherwise the vector container uses the superclass's allocate/deallocate
methods.

Update: The memory mapping allocator is now available at https://github.com/johannesthoma/mmap_allocator and is LGPL. Feel free to use it for your projects.
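The skeleton above only logs; as a hedged sketch of where this was heading, a minimal C++11-style allocator (no std::allocator inheritance, hence no rebind member needed) that actually hands out mmap'd pages could look like the following. MAP_ANONYMOUS stands in for the real file-backed mapping, which would carry an fd and offset; the name mmap_page_allocator is mine:

```cpp
#include <sys/mman.h>   // mmap, munmap (POSIX)
#include <cstddef>
#include <new>
#include <vector>

// Minimal-allocator sketch: storage comes straight from mmap'd pages.
// A file-backed version would replace MAP_ANONYMOUS/-1/0 with a real
// mapping of the large input file.
template <typename T>
struct mmap_page_allocator {
    using value_type = T;

    mmap_page_allocator() noexcept = default;
    template <typename U>
    mmap_page_allocator(const mmap_page_allocator<U>&) noexcept {}

    T* allocate(std::size_t n) {
        void* p = ::mmap(nullptr, n * sizeof(T),
                         PROT_READ | PROT_WRITE,
                         MAP_PRIVATE | MAP_ANONYMOUS, /*fd=*/-1, /*offset=*/0);
        if (p == MAP_FAILED)
            throw std::bad_alloc();
        return static_cast<T*>(p);
    }

    void deallocate(T* p, std::size_t n) noexcept {
        ::munmap(p, n * sizeof(T));   // kernel rounds the length to pages
    }
};

template <typename T, typename U>
bool operator==(const mmap_page_allocator<T>&, const mmap_page_allocator<U>&) noexcept { return true; }
template <typename T, typename U>
bool operator!=(const mmap_page_allocator<T>&, const mmap_page_allocator<U>&) noexcept { return false; }
```

Usage is the same as in the answer: `std::vector<int, mmap_page_allocator<int>> v(1024, 0);`.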

世界和平 2024-07-26 17:23:35

When working with GPUs or other co-processors it is sometimes beneficial to allocate data structures in main memory in a special way. This special way of allocating memory can be implemented in a custom allocator in a convenient fashion.

The reason why custom allocation through the accelerator runtime can be beneficial when using accelerators is the following:

  1. through custom allocation the accelerator runtime or driver is notified of the memory block
  2. in addition the operating system can make sure that the allocated block of memory is page-locked (some call this pinned memory), that is, the virtual memory subsystem of the operating system may not move or remove the page within or from memory
  3. if 1. and 2. hold and a data transfer between a page-locked memory block and an accelerator is requested, the runtime can directly access the data in main memory since it knows where it is and it can be sure the operating system did not move/remove it
  4. this saves one memory copy that would occur with memory allocated in a non-page-locked way: the data would have to be copied in main memory to a page-locked staging area from which the accelerator could initiate the data transfer (through DMA)
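Step 2 can be approximated in portable POSIX terms with mlock. Note this is only a sketch of the page-locking half: a real accelerator runtime would use its own pinned-allocation API (e.g. CUDA's cudaHostAlloc), which also performs the driver notification of step 1.

```cpp
#include <sys/mman.h>   // mlock, munlock (POSIX)
#include <cstddef>
#include <new>
#include <vector>

// Allocator sketch that page-locks every block so the OS keeps it resident.
// Pinning is treated as best-effort here because mlock can fail under
// RLIMIT_MEMLOCK; a real runtime would surface that error.
template <typename T>
struct pinned_allocator {
    using value_type = T;

    pinned_allocator() noexcept = default;
    template <typename U>
    pinned_allocator(const pinned_allocator<U>&) noexcept {}

    T* allocate(std::size_t n) {
        void* p = ::operator new(n * sizeof(T));
        (void)::mlock(p, n * sizeof(T));   // keep pages in physical memory
        return static_cast<T*>(p);
    }

    void deallocate(T* p, std::size_t n) noexcept {
        (void)::munlock(p, n * sizeof(T));
        ::operator delete(p);
    }
};

template <typename T, typename U>
bool operator==(const pinned_allocator<T>&, const pinned_allocator<U>&) noexcept { return true; }
template <typename T, typename U>
bool operator!=(const pinned_allocator<T>&, const pinned_allocator<U>&) noexcept { return false; }
```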
⒈起吃苦の倖褔 2024-07-26 17:23:35

I'm working with a MySQL storage engine that uses C++ for its code. We're using a custom allocator to use the MySQL memory system rather than competing with MySQL for memory. It allows us to make sure we're using memory the way the user configured MySQL to use it, and not "extra".

最后的乘客 2024-07-26 17:23:35

It can be useful to use custom allocators to use a memory pool instead of the heap. That's one example among many others.

For most cases, this is certainly a premature optimization. But it can be very useful in certain contexts (embedded devices, games, etc).

波浪屿的海角声 2024-07-26 17:23:35

I haven't written C++ code with a custom STL allocator, but I can imagine a webserver written in C++ that uses a custom allocator for automatic deletion of the temporary data needed to respond to an HTTP request. The custom allocator can free all temporary data at once, as soon as the response has been generated.

Another possible use case for a custom allocator (which I have used) is writing a unit test to prove that a function's behavior doesn't depend on some part of its input. The custom allocator can fill up the memory region with any pattern.
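A sketch of that second use case might look like the following; the pattern_allocator name and the 0xAA default are mine, not from the answer:

```cpp
#include <cstddef>
#include <cstring>
#include <memory>

// Every fresh allocation is filled with a chosen byte pattern, so a test
// can run the function under test twice with different patterns and check
// the results match, proving it never reads uninitialized allocator memory.
template <typename T>
struct pattern_allocator {
    using value_type = T;
    unsigned char pattern = 0xAA;   // arbitrary default poison byte

    pattern_allocator() noexcept = default;
    explicit pattern_allocator(unsigned char pat) noexcept : pattern(pat) {}
    template <typename U>
    pattern_allocator(const pattern_allocator<U>& o) noexcept : pattern(o.pattern) {}

    T* allocate(std::size_t n) {
        T* p = std::allocator<T>{}.allocate(n);
        std::memset(static_cast<void*>(p), pattern, n * sizeof(T));
        return p;
    }
    void deallocate(T* p, std::size_t n) noexcept {
        std::allocator<T>{}.deallocate(p, n);
    }
};

template <typename T, typename U>
bool operator==(const pattern_allocator<T>& a, const pattern_allocator<U>& b) noexcept {
    return a.pattern == b.pattern;
}
template <typename T, typename U>
bool operator!=(const pattern_allocator<T>& a, const pattern_allocator<U>& b) noexcept {
    return !(a == b);
}
```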

撧情箌佬 2024-07-26 17:23:35

I'm using custom allocators here; you might even say it was to work around other custom dynamic memory management.

Background: we have overloads for malloc, calloc, free, and the various variants of operator new and delete, and the linker happily makes STL use these for us. This lets us do things like automatic small object pooling, leak detection, alloc fill, free fill, padding allocation with sentries, cache-line alignment for certain allocs, and delayed free.

The problem is, we're running in an embedded environment -- there isn't enough memory around to actually do leak detection accounting properly over an extended period. At least, not in the standard RAM -- there's another heap of RAM available elsewhere, through custom allocation functions.

Solution: write a custom allocator that uses the extended heap, and use it only in the internals of the memory leak tracking architecture... Everything else defaults to the normal new/delete overloads that do leak tracking. This avoids the tracker tracking itself (and provides a bit of extra packing functionality too, since we know the size of tracker nodes).

We also use this to keep function cost profiling data, for the same reason; writing an entry for each function call and return, as well as thread switches, can get expensive fast. The custom allocator again gives us smaller allocs in a larger debug memory area.

老娘不死你永远是小三 2024-07-26 17:23:35

A custom allocator is a reasonable way to securely erase memory before it is deallocated.

#include <cstddef>
#include <new>
#include <openssl/crypto.h>  // OPENSSL_cleanse

template <class T>
class allocator
{
public:
    using value_type    = T;

    allocator() noexcept {}
    template <class U> allocator(allocator<U> const&) noexcept {}

    value_type*  // Use pointer if pointer is not a value_type*
    allocate(std::size_t n)
    {
        return static_cast<value_type*>(::operator new (n*sizeof(value_type)));
    }

    void
    deallocate(value_type* p, std::size_t n) noexcept  // Use pointer if pointer is not a value_type*
    {
        OPENSSL_cleanse(p, n*sizeof(value_type));  // scrub contents before freeing
        ::operator delete(p);
    }
};
template <class T, class U>
bool
operator==(allocator<T> const&, allocator<U> const&) noexcept
{
    return true;
}
template <class T, class U>
bool
operator!=(allocator<T> const& x, allocator<U> const& y) noexcept
{
    return !(x == y);
}

I recommend using Howard Hinnant's allocator boilerplate:
https://howardhinnant.github.io/allocator_boilerplate.html

踏雪无痕 2024-07-26 17:23:35

I am using a custom allocator for counting the number of allocations/deallocations in one part of my program and measuring how long it takes. There are other ways this could be achieved but this method is very convenient for me. It is especially useful that I can use the custom allocator for only a subset of my containers.
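One possible shape for such a counting allocator (the names are mine, not the poster's code): it forwards to std::allocator and records into an external stats struct, so only the containers you hand the struct to are instrumented.

```cpp
#include <cstddef>
#include <memory>
#include <vector>

// Shared tally; one instance can be passed to just the containers of
// interest, leaving the rest of the program unmeasured.
struct alloc_stats {
    std::size_t allocations   = 0;
    std::size_t deallocations = 0;
    std::size_t bytes         = 0;
};

template <typename T>
struct counting_allocator {
    using value_type = T;
    alloc_stats* stats;

    explicit counting_allocator(alloc_stats* s) noexcept : stats(s) {}
    template <typename U>
    counting_allocator(const counting_allocator<U>& o) noexcept : stats(o.stats) {}

    T* allocate(std::size_t n) {
        ++stats->allocations;
        stats->bytes += n * sizeof(T);
        return std::allocator<T>{}.allocate(n);
    }
    void deallocate(T* p, std::size_t n) noexcept {
        ++stats->deallocations;
        std::allocator<T>{}.deallocate(p, n);
    }
};

template <typename T, typename U>
bool operator==(const counting_allocator<T>& a, const counting_allocator<U>& b) noexcept {
    return a.stats == b.stats;
}
template <typename T, typename U>
bool operator!=(const counting_allocator<T>& a, const counting_allocator<U>& b) noexcept {
    return !(a == b);
}
```

Timing could be added the same way, wrapping the forwarded calls with a clock and accumulating into the stats struct.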

我们的影子 2024-07-26 17:23:35

One essential situation: when writing code that must work across module (EXE/DLL) boundaries, you must keep your allocations and deletions happening in only one module.

Where I ran into this was a plugin architecture on Windows. It is essential, for example, that if you pass a std::string across a DLL boundary, any reallocations of the string occur from the heap where it originated, not from the heap in the DLL, which may be different*.

*It's actually more complicated than this: if you are dynamically linking to the CRT, this might work anyway. But if each DLL has a static link to the CRT, you are headed for a world of pain, where phantom allocation errors continually occur.

冷情 2024-07-26 17:23:35

Obligatory link to Andrei Alexandrescu's CppCon 2015 talk on allocators:

https://www.youtube.com/watch?v=LIb3L4vKZ7U

The nice thing is that just devising them makes you think of ideas for how you would use them :-)

人事已非 2024-07-26 17:23:35

Some time ago I found this solution very useful to me: Fast C++11 allocator for STL containers. It speeds up STL containers noticeably on VS2017 (~5x) as well as on GCC (~7x). It is a special-purpose allocator based on a memory pool, and it can be used with STL containers only thanks to the mechanism you are asking about.

我的鱼塘能养鲲 2024-07-26 17:23:35

I personally use Loki::Allocator / SmallObject to optimize memory usage for small objects. It shows good efficiency and satisfying performance if you have to work with moderate amounts of really small objects (1 to 256 bytes). It can be up to ~30 times more efficient than standard C++ new/delete allocation if we talk about allocating moderate amounts of small objects of many different sizes. There is also a VC-specific solution called "QuickHeap" which brings the best possible performance: the allocate and deallocate operations just read and write the address of the block being allocated/returned to the heap, in up to 99.(9)% of cases, depending on settings and initialization. The cost is a notable overhead: it needs two pointers per extent and one extra pointer for each new memory block. It is the fastest possible solution for working with huge (10,000++) numbers of objects being created and deleted, if you don't need a big variety of object sizes (it creates an individual pool for each object size from 1 to 1023 bytes in the current implementation). Since the initialization costs may diminish the overall performance boost, one can allocate/deallocate some dummy objects before the application enters its performance-critical phase(s).

The issue with the standard C++ new/delete implementation is that it's usually just a wrapper around C malloc/free allocation, and it works well for larger blocks of memory, like 1024+ bytes. For small blocks it has a notable overhead in terms of performance and, sometimes, extra memory used for mapping too. So, in most cases custom allocators are implemented in a way that maximizes performance and/or minimizes the amount of extra memory needed for allocating small (≤1024-byte) objects.

柳若烟 2024-07-26 17:23:35

For shared memory it is vital that not only the container head, but also the data it contains, are stored in shared memory.

The allocator of Boost::Interprocess is a good example. However, as you can read here, this alone does not suffice to make all STL containers shared-memory compatible (due to different mapping offsets in different processes, pointers might "break").

酷炫老祖宗 2024-07-26 17:23:35

In a graphics simulation, I've seen custom allocators used for

  1. Alignment constraints that std::allocator didn't directly support.
  2. Minimizing fragmentation by using separate pools for short-lived (just this frame) and long-lived allocations.
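A sketch of case 1, assuming C++17's aligned operator new is available; the 64-byte figure used below is just a cache-line example, and the allocator name is mine:

```cpp
#include <cstddef>
#include <cstdint>
#include <new>      // std::align_val_t (C++17)
#include <vector>

// Over-aligns every block to Align bytes, e.g. a cache line or SIMD width
// that std::allocator of the day did not guarantee.
template <typename T, std::size_t Align>
struct aligned_allocator {
    static_assert(Align >= alignof(T), "Align must not weaken T's alignment");
    using value_type = T;

    // Member rebind is needed because the default in allocator_traits
    // cannot see past the non-type Align parameter.
    template <typename U>
    struct rebind { using other = aligned_allocator<U, Align>; };

    aligned_allocator() noexcept = default;
    template <typename U>
    aligned_allocator(const aligned_allocator<U, Align>&) noexcept {}

    T* allocate(std::size_t n) {
        return static_cast<T*>(::operator new(n * sizeof(T), std::align_val_t{Align}));
    }
    void deallocate(T* p, std::size_t) noexcept {
        ::operator delete(p, std::align_val_t{Align});
    }
};

template <typename T, typename U, std::size_t A>
bool operator==(const aligned_allocator<T, A>&, const aligned_allocator<U, A>&) noexcept { return true; }
template <typename T, typename U, std::size_t A>
bool operator!=(const aligned_allocator<T, A>&, const aligned_allocator<U, A>&) noexcept { return false; }
```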
柒七 2024-07-26 17:23:35

The OP asked for a good reason to use a custom allocator. Many good individual reasons were already given in the various answers. A generic argument can be made though.

Any general-purpose allocator must deal with various different usage patterns or allocation trends, which may span a wide array of possibilities. It is rather challenging to build a generic solution that behaves well on average while also not performing extremely badly in some very specific situations. The diversity of usage patterns, and the extreme disparity between them, tends to impose a kind of borderline limit on the performance a generic allocator solution can achieve.

However, if we know the specific usage trend and can ensure the allocator is only used under some limited and well defined circumstances, it is possible to yield better results or even use a container in a situation where using the generic solution would be prohibitive (e.g. embedded and limited system).

To illustrate this with one example (where I used a custom allocation scheme with great success): assuming that...

  • a slate of objects is created based on some discovery / evaluation / translation and is cross-wired immediately,
  • these objects are then used for some time in a tight loop, and
  • based on external circumstances we know that all these objects cannot be used any more after a given point in time.

A custom allocator can then claim some large blocks, ideally allocated close together, place all objects into that area, implement all clean-up as a no-op and thus omit having any internal management infrastructure; rather, the allocated blocks will just be abandoned.
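For what it's worth, C++17 packaged essentially this scheme as std::pmr::monotonic_buffer_resource: bump allocation out of large blocks, deallocation as a no-op, and the whole arena abandoned at once. A minimal sketch:

```cpp
#include <cstddef>
#include <memory_resource>
#include <vector>

// Builds the slate of objects inside one stack buffer, uses them in a tight
// loop, and abandons the whole region when the arena goes out of scope.
// null_memory_resource() as upstream guarantees nothing escapes the buffer.
int frame_sum() {
    std::byte buffer[4096];
    std::pmr::monotonic_buffer_resource arena{buffer, sizeof(buffer),
                                              std::pmr::null_memory_resource()};

    std::pmr::vector<int> objs{&arena};
    for (int i = 0; i < 100; ++i)
        objs.push_back(i);

    int sum = 0;
    for (int v : objs)
        sum += v;
    return sum;
    // Destroying objs "frees" into the arena: a no-op. The memory is
    // reclaimed wholesale when arena is destroyed.
}
```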

风尘浪孓 2024-07-26 17:23:35

One example of a time I have used these was working with very resource-constrained embedded systems. Let's say you have 2K of RAM free and your program has to use some of that memory. You need to store, say, 4-5 sequences somewhere that's not on the stack, and additionally you need very precise control over where these things get stored; this is a situation where you might want to write your own allocator. The default implementations can fragment the memory, which might be unacceptable if you don't have enough memory and cannot restart your program.

One project I was working on was using AVR-GCC on some low-powered chips. We had to store 8 sequences of variable length but with a known maximum. The standard library implementation of memory management is a thin wrapper around malloc/free which keeps track of where to place items by prepending every allocated block of memory with a pointer to just past the end of that allocated piece of memory. When allocating a new piece of memory the standard allocator has to walk over each of the pieces of memory to find the next available block where the requested size of memory will fit. On a desktop platform this would be very fast for so few items, but you have to keep in mind that some of these microcontrollers are very slow and primitive in comparison. Additionally, the memory fragmentation issue was a massive problem, which meant we really had no choice but to take a different approach.

So what we did was implement our own memory pool. Each block of memory was big enough to fit the largest sequence we would need in it. This allocated fixed-size blocks of memory ahead of time and marked which blocks of memory were currently in use. We did this by keeping one 8-bit integer where each bit represented whether a certain block was used. We traded off memory usage here in an attempt to make the whole process faster, which in our case was justified as we were pushing this microcontroller chip close to its maximum processing capacity.

There are a number of other times I can see writing your own custom allocator in the context of embedded systems, for example if the memory for the sequence isn't in main RAM, as may frequently be the case on these platforms.
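The fixed-block pool described above might be sketched like this; the 8-block bitmap follows the description, while the block size and the names are my assumptions:

```cpp
#include <cstddef>
#include <cstdint>

// NumBlocks fixed-size blocks tracked by a single 8-bit bitmap. BlockSize
// would be the known maximum sequence length. Because every block has the
// same size, fragmentation is impossible by construction.
template <std::size_t BlockSize, std::size_t NumBlocks = 8>
class fixed_block_pool {
    static_assert(NumBlocks <= 8, "the bitmap below is a single uint8_t");
    alignas(std::max_align_t) unsigned char storage_[BlockSize * NumBlocks];
    std::uint8_t used_ = 0;  // bit i set => block i is in use
public:
    void* acquire() {
        for (std::size_t i = 0; i < NumBlocks; ++i) {
            if (!(used_ & (1u << i))) {
                used_ |= static_cast<std::uint8_t>(1u << i);
                return storage_ + i * BlockSize;
            }
        }
        return nullptr;  // pool exhausted
    }
    void release(void* p) {
        std::size_t i =
            static_cast<std::size_t>(static_cast<unsigned char*>(p) - storage_) / BlockSize;
        used_ &= static_cast<std::uint8_t>(~(1u << i));
    }
};
```

Both acquire and release are a handful of instructions with no heap walk, which is the point on a slow microcontroller.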
