如何用C语言编写一个线程安全、高效、无锁的内存分配器?

发布于 2024-08-16 20:23:38 字数 147 浏览 6 评论 0原文

如何用C语言编写一个线程安全、高效、无锁的内存分配器?我所说的高效是指:

  1. 快速分配和快速分配。解除分配

  2. 最佳内存使用(最小浪费,无外部碎片)

  3. 最小元数据开销

How to write a thread-safe and efficient, lock-free memory allocator in C? By efficient I mean:

  1. Fast allocation & deallocation

  2. Optimal memory usage (minimal wastage and no external fragmentation)

  3. Minimal meta-data overhead

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(3

世界和平 2024-08-23 20:23:38

http://www.research.ibm.com/people/ m/michael/pldi-2004.pdf

本文提出了一种完全无锁的内存分配器。它仅使用广泛可用的操作系统支持和硬件原子指令。即使在任意线程下,它也能提供有保证的可用性
终止和崩溃失败,并且无论调度策略如何,它都不会出现死锁,因此可以
甚至在中断处理程序和实时应用程序中使用
不需要特殊的调度程序支持。此外,通过利用 Hoard 的一些高级结构,我们的分配器
具有高度可扩展性,将空间膨胀限制在一个常数因子内,
并且能够避免错误共享...

http://www.research.ibm.com/people/m/michael/pldi-2004.pdf

This paper presents a completely lock-free memory allocator. It uses only widely-available operating system support and hardware atomic instructions. It offers guaranteed availability even under arbitrary thread
termination and crash-failure, and it is immune to dead-lock regardless of scheduling policies, and hence it can be
used even in interrupt handlers and real-time applications
without requiring special scheduler support. Also, by leveraging some high-level structures from Hoard, our allocator
is highly scalable, limits space blowup to a constant factor,
and is capable of avoiding false sharing...

电影里的梦 2024-08-23 20:23:38

取决于你所说的效率是什么意思。如果我关心的是让事情变得更快,那么我可能会给每个线程它自己的单独的内存池来使用,以及一个从该池中获取内存的自定义“malloc”。当然,如果我关心的是速度,我可能会首先避免分配。

没有一个答案;您将平衡一系列问题。获得无锁分配器几乎是不可能的,但是您可以尽早且不频繁地进行锁定(通过为每个线程分配大池),或者您可以使锁变得如此小和紧密以至于它们必须是正确的。

Depends on what you mean by efficiency. If my concern was to make things fast, then I would probably give each thread it's own separate memory pool to work with, and a custom 'malloc' that took memory from that pool. Of course, if my concern was speed, I would probably avoid allocation in the first place.

There is no one answer; you'll be balancing a range of concerns. It will be pretty much impossible to get a lock-free allocator, but you can either do the locking early and infrequently ( by allocating large pools for each thread ) or you can make the locks so small and tight that they must be correct.

若有似无的小暗淡 2024-08-23 20:23:38

您可以使用无锁列表和几个不同大小的存储桶。

所以:

typedef struct
{
    union{
        SLIST_ENTRY entry;
    void* list;
};
byte mem[];
} mem_block;

typedef struct
{
    SLIST_HEADER root;
} mem_block_list;

#define BUCKET_COUNT 4
#define BLOCKS_TO_ALLOCATE 16

static mem_block_list Buckets[BUCKET_COUNT];

void init_buckets()
{
    for( int i = 0; i < BUCKET_COUNT; ++i )
    {
        InitializeSListHead( &Buckets[i].root );
        for( int j = 0; j < BLOCKS_TO_ALLOCATE; ++j )
        {
            mem_block* p = (mem_block*) malloc( sizeof( mem_block ) + (0x1 << BUCKET_COUNT) * 0x8 );
            InterlockedPushEntrySList( &Buckets[i].root, &p->entry );
        }
    }
}

void* balloc( size_t size )
{
    for( int i = 0; i < BUCKET_COUNT; ++i )
    {
        if( size <= (0x1 << i) * 0x8 )
        {
            mem_block* p = (mem_block*) InterlockedPopEntrySList( &Buckets[i].root );
            p->list = &Buckets[i];
        }
    }

    return 0;   // block to large
}

void  bfree( void* p )
{
    mem_block* block = (mem_block*) (((byte*)p) - sizeof( block->entry ));
    InterlockedPushEntrySList( ((mem_block_list*)block)->root, &block->entry );
}

SLIST_ENTRY、InterlockedPushEntrySList、InterlockedPopEntrySList、InitializeSListHead是Win32下无锁单链表操作的函数。其他操作系统请参考相应操作。

缺点:

  • sizeof( SLIST_ENTRY ) 的开销
  • 存储桶只能在开始时有效增长一次,之后您可能会耗尽内存,并且必须询问操作系统/其他存储桶。 (其他桶会导致碎片)
  • 这个示例有点太简单了,必须扩展以处理更多情况

You can use a lock free list and a couple of buckets of differing sizes.

So :

typedef struct
{
    union{
        SLIST_ENTRY entry;
    void* list;
};
byte mem[];
} mem_block;

typedef struct
{
    SLIST_HEADER root;
} mem_block_list;

#define BUCKET_COUNT 4
#define BLOCKS_TO_ALLOCATE 16

static mem_block_list Buckets[BUCKET_COUNT];

void init_buckets()
{
    for( int i = 0; i < BUCKET_COUNT; ++i )
    {
        InitializeSListHead( &Buckets[i].root );
        for( int j = 0; j < BLOCKS_TO_ALLOCATE; ++j )
        {
            mem_block* p = (mem_block*) malloc( sizeof( mem_block ) + (0x1 << BUCKET_COUNT) * 0x8 );
            InterlockedPushEntrySList( &Buckets[i].root, &p->entry );
        }
    }
}

void* balloc( size_t size )
{
    for( int i = 0; i < BUCKET_COUNT; ++i )
    {
        if( size <= (0x1 << i) * 0x8 )
        {
            mem_block* p = (mem_block*) InterlockedPopEntrySList( &Buckets[i].root );
            p->list = &Buckets[i];
        }
    }

    return 0;   // block to large
}

void  bfree( void* p )
{
    mem_block* block = (mem_block*) (((byte*)p) - sizeof( block->entry ));
    InterlockedPushEntrySList( ((mem_block_list*)block)->root, &block->entry );
}

SLIST_ENTRY, InterlockedPushEntrySList, InterlockedPopEntrySList, InitializeSListHead are functions for lock-free single-linked-list operations under Win32. Use the according operations on other OSes.

Drawbacks :

  • Overhead of sizeof( SLIST_ENTRY )
  • The buckets can only grow efficiently once at the start, after that you can run out of memory and have to ask the OS/other buckets. (Other buckets leads to fragmentation)
  • This sample is a bit too easy and must be extended to handle more cases
~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文