带有空闲列表的无锁堆栈:为什么下一个指针不需要是原子的?

发布于 2025-01-11 01:14:16 字数 1687 浏览 3 评论 0原文

无锁堆栈可以实现为单链表。这看起来很简单,直到我们必须考虑在弹出节点后如何处理它们。一种策略是简单地将它们移动到每个堆栈的后进先出空闲列表(后续的推送操作可以重用节点),直到最终所有线程都完成堆栈,此时单个线程会销毁堆栈中的所有节点和所有节点。空闲列表中的节点。 Boost.Lockfree 使用此策略。 Chris Wellons 的 C11 实现也是如此。我将参考后者,因为它更容易阅读,而且细节基本相同,因为 C11 原子与 C++11 原子非常相似。

在 Wellons 的实现中(可以在 GitHub 此处 上找到),所有 lstack_node 对象是非原子的。特别是,这意味着对 lstack_node 对象的 next 成员的所有访问都是非原子的。我无法理解的是:为什么此类访问从不相互竞争?

next 成员在 lstack.c 处读取:30。它写于lstack.c:39。如果这两行代码可以在同一个 lstack_node 对象上同时执行,则该程序包含竞争。这可能吗?在我看来这是可能的:

  • 线程 1 调用 lstack_pop,它又调用 pop。它以原子方式将头节点的值加载到局部变量orig中。现在,orig.node 是指向刚刚位于堆栈顶部的节点的指针。 (请注意,到目前为止,仅修改了局部变量,因此线程 1 迄今为止所做的任何操作都不可能导致任何其他线程中的 CAS 失败。)同时...
  • 线程 2 调用 lstack_pop 。 pop成功并返回node,指向刚刚从堆栈中切除的节点的指针;这与线程 1 中 orig.node 指向的节点相同。然后它开始调用 push 以便将 node 添加到空闲列表。 freelist头节点被原子加载,并且node->next被设置为指向freelist中的第一个节点。
  • 哎呀。这与线程 1 中对 orig.node->next 的读取发生竞争。Wellons

的实现可能就是错误的吗?我对此表示怀疑。如果他的实现是错误的,那么 Boost 的实现也是错误的,因为修复(在我看来)竞争条件的唯一方法是使 next 原子化。但我不认为 Boost 的实现可能会在如此基本的方式上出现错误,除非现在已经注意到并修复了它。所以我的推理一定是错误的。

A lock-free stack can be implemented as a singly linked list. This seems simple until we have to think about what to do with nodes after they've been popped. One strategy is to simply move them to a per-stack LIFO freelist (from which nodes can be reused by subsequent push operations) until eventually all threads are done with the stack, at which point a single thread destroys all nodes in the stack and all nodes in the freelist. Boost.Lockfree uses this strategy. So does Chris Wellons's C11 implementation. I will refer to the latter, because it's easier to read and the details are essentially the same, since C11 atomics are very similar to C++11 atomics.

In Wellons's implementation, which can be found on GitHub here, all lstack_node objects are non-atomic. In particular, this means that all accesses to the next member of an lstack_node object are non-atomic. What I am unable to understand is: why is it that such accesses never race with each other?

The next member is read at lstack.c:30. It is written at lstack.c:39. If these two lines can execute concurrently on the same lstack_node object, then the program contains a race. Is this possible? It seems possible to me:

  • Thread 1 calls lstack_pop, which calls pop. It atomically loads the head node's value into the local variable orig. Now, orig.node is a pointer to the node that was at the top of the stack just now. (Note that up until this point, only local variables have been modified, so it is impossible for anything that thread 1 has done so far to make a CAS fail in any other thread.) Meanwhile...
  • Thread 2 calls lstack_pop. pop succeeds and returns node, a pointer to the node that has just been excised from the stack; this is the same node that orig.node points to in thread 1. It then begins to call push in order to add node to the freelist. The freelist head node is atomically loaded, and node->next is set to point to the first node in the freelist.
  • Oops. This races with the read to orig.node->next in thread 1.

Could Wellons's implementation simply be wrong? I doubt it. If his implementation is wrong, then so is the Boost one, because the only way to fix (what appears to me to be) the race condition is to make next atomic. But I don't think the Boost implementation could be wrong in such a basic way without it having been noticed and fixed by now. So I must have made a mistake in my reasoning.

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(2

路弥 2025-01-18 01:14:16

需要注意的关键是,当前链表中的每个节点的 next 字段都是只读的。仅当该节点已成功从链表中删除时才能修改下一个。一旦发生这种情况,删除它的线程将“拥有”它,并且其他人都无法明智地查看它(他们可能读取一个值,但是当他们的compare_and_set失败时该值将被丢弃。)因此拥有线程可以安全地修改下一个字段,作为将其推送到另一个列表的一部分。

在您的假设中,您忽略了这样一个事实:两个弹出(由两个线程完成)不能成功并返回相同的节点。如果两个线程尝试同时弹出,它们可能会获得相同的节点指针,但其中一个线程会在compare_and_set原子指令中失败,并且会使用不同的节点指针循环返回。

这确实要求读/写竞争是“安全的”——也就是说,当两个线程之间进行读/写竞争时,读取器可能会获得任何值,但不会失败(没有陷阱或其他未定义的行为),并且否则不会干扰写入,但大多数(所有?)硬件上往往都是这种情况。只要读取器不依赖于比赛期间读取的值,您就可以在事后检测比赛并忽略该值。

The key thing to note is that the next fields are read-only for every node that is currently in a linked list. The next can only be modified when the node has been successfully removed from a linked list. Once that happens, the thread that removed it 'owns' it and noone else can sensibly look at it (they might read a value, but that value will be thrown away when their compare_and_set fails.) So that owning thread can safely modify the next field as part of pushing it on another list.

In your hypothetical, you're missing the fact that the two pops (done by the two threads) cannot both succeed and return the same node. If two threads try to simultaneously pop, they might get the same node pointer, but one will fail in the compare_and_set atomic instruction and will loop back with a different node pointer.

This does require that read/write races are "safe" -- that is when you have a read/write race between two threads, the reader might get any value but won't otherwise fail (no trap or other undefined behavior), and won't otherwise interfere with the write, but that tends to be the case on most (all?) hardware. As long as the reader does not depend on the value read during a race, you can detect the race after the fact and disregard that value.

三生池水覆流年 2025-01-18 01:14:16

我只是写了一篇长文试图解释为什么不能有竞争,直到我仔细研究了 Wellson 是如何实现空闲列表的,我得出的结论是你是对的!

这里重要的一点是你在问题一开始提到的:

这看起来很简单,直到我们必须考虑在弹出节点后如何处理它们。一种策略是简单地将它们移至每个堆栈的 LIFO 空闲列表,直到最终所有线程都完成堆栈的操作,此时单个线程会销毁堆栈中的所有节点以及空闲列表中的所有节点。

但这不是 Wellson 实现中的空闲列表的工作原理!相反,它会尝试重用空闲列表中的节点,但正如您所观察到的那样,next 也需要是原子的。如果空闲列表已按照您所描述的方式实现,即弹出的节点将被添加到某个空闲列表中(未更改,即,空闲列表使用不同 的指针< code>next),并且只有在堆栈不再被任何线程使用时才释放,那么 next 可以是一个普通变量,因为一旦节点成功推送,它就不会改变。

这并不一定意味着 boost 无锁队列也是不正确的,但我不知道代码是否足够好,无法对 boost 实现做出合格的声明。

FWIW - 这通常称为内存回收问题。这种空闲列表方法是一个简单的解决方案,但通常不适用于现实场景。对于现实场景,您可能希望使用内存回收方案,例如危险指针或基于纪元的回收。你可以看看我的 xenium 库,我在其中实现了几种不同的回收方案以及锁定 -使用它们的免费数据结构。有关内存回收问题和我在 xenium 中的实现的更多信息也可以在我的论文 有效记忆C++ 中无锁数据结构的回收

I just wrote a long text trying to explain why there cannot be a race, until I took a closer look at how Wellson implemented the free list, and I came to the conclusion that you are correct!

The important point here is what you mentioned at the very beginning of your question:

This seems simple until we have to think about what to do with nodes after they've been popped. One strategy is to simply move them to a per-stack LIFO freelist until eventually all threads are done with the stack, at which point a single thread destroys all nodes in the stack and all nodes in the freelist.

But that is not how the freelist in Wellson's implementation works! Instead it tries to reuse nodes from the freelist, but then next also needs to be atomic as you observed correctly. If the free list would have been implemented as you described, i.e., the popped nodes would be added to some freelist (unaltered, i.e., the freelist uses a different pointer than next) and only released once the stack is no longer used by any thread, then next could be a plain variable since it would not be change once the node has been pushed successfully.

That does not necessarily mean that the boost lock-free queue is also incorrect, but I don't know the code good enough to make a qualified statement about the boost implementation.

FWIW - this is usually referred as the memory reclamation problem. This freelist approach is a simple solution, though usually not practical for real-world scenarios. For a real-world scenario you probably want to use a memory reclamation scheme like hazard pointers or epoch based reclamation. You can take a look at my xenium library where I have implemented several different reclamation schemes as well as lock-free data structures that use them. More information about the memory reclamation problem and my implementations in xenium can be also found in my thesis Effective memory reclamation for lock-free data structures in C++.

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文