圆形缓冲器对比。锁定空闲堆栈以实现空闲列表
当我为了好玩而编写一些多线程代码时,我想到了以下情况:
一个线程从内存池中声明一个资源单元,它处理它并将指向该数据的指针发送到另一个线程以进行进一步操作循环缓冲区(1R / 1W 情况)。
后者必须在处理完接收到的数据时通知前一个线程,以便回收内存。
我想知道是否更好 - 性能方面 - 将这个“Freelist”实现为另一个循环缓冲区 - 保存空闲资源的地址 - 或者选择无锁堆栈方式(在 x86-64 上实现 DCAS)。
一般来说,两种不同方法的优缺点是什么?
As I have been writing some multi-threaded code for fun, I came up with the following situation:
a thread claims a single resource unit from a memory pool, it processes it and sends a pointer to this data to another thread for further operation using a circular buffer (1R / 1W case).
The latter must inform the former thread whenever it is done with the data he received, so that the memory can be recycled.
I wonder whether it is better - performance-wise - to implement this "Freelist" as another circular buffer - holding the addresses of free resources - or choose the lock-free stack way (implementing DCAS on x86-64).
Generally speaking, what could be the pros and the cons of the two different approaches ?
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(2)
为了以防万一,无锁和无等待之间是有区别的。前者意味着没有锁定,但线程仍然可以忙自旋而没有取得任何进展。后者意味着线程总是在没有锁定或忙自旋的情况下取得进展。
对于一个读取器和一个写入器,无锁和无等待的 FIFO 循环缓冲区的实现很简单。
我听说 LIFO 堆栈也可以免等待,但对 FIFO 列表不太确定。听起来你在这里需要一个队列而不是一个堆栈。
Just in case, there is a difference between lock-free and wait-free. The former means there is no locking but the thread can still busy-spin not making any progress. The latter means that the thread always makes progress with no locking or busy-spinning.
With one reader and one writer lock-free and wait-free FIFO circular buffer is trivial to implement.
I hear that LIFO stack can also be made wait-free, but not so sure about FIFO list. And it sound like you need a queue here rather then a stack.
主要区别是循环缓冲区是有界的,而堆栈则没有。
如果不进行测试,很难对此类事情做出性能判断。一方面,循环缓冲区由连续数组支持。如果读取器和写入器索引彼此保持“接近”,则每个线程都会不断地使共享缓存行失效。
另一方面,对于堆栈,您可能会争夺堆栈顶部指针,从而导致线程有时在 CAS 循环中旋转。
我的猜测是,最好的选择取决于工作负载。
The main difference is the circular buffer will be bounded, while the stack will not.
It's hard to make a performance judgement on things like this without testing. On the one hand, the circular buffer is backed by a contiguous array. If the reader and writer indices remain "near" each other, you'll have each thread constantly invalidating a shared cache line.
On the other hand, with a stack you can have contention for the top-of-stack pointer, resulting in threads sometimes spinning in the CAS loop.
My guess would be that the best choice is workload-dependent.