Synchronized access to a doubly linked list

Posted 2024-10-05 06:35:54

I'm trying to implement a (special kind of) doubly-linked list in C, in a pthreads environment but using only C-wrapped synchronization instructions like atomic CAS, etc. rather than pthread primitives. (The elements of the list are fixed-size chunks of memory and almost surely cannot fit pthread_mutex_t etc. inside them.) I don't actually need full arbitrary doubly-linked list methods, only:

insertion at the end of the list
deletion from the beginning of the list
deletion at arbitrary points in the list based on a pointer to the member to be removed, which was obtained from a source other than by traversing the list.

So perhaps a better way to describe this data structure would be a queue/fifo with the possibility of removing items mid-queue.

Is there a standard approach to synchronizing this? I'm getting stuck on possible deadlock issues, some of which are probably inherent to the algorithms involved and others of which might stem from the fact that I'm trying to work in a confined space with other constraints on what I can do.

Edit: In particular, I'm stuck on what to do if adjacent objects are to be removed simultaneously. Presumably when removing an object, you need to obtain locks on both the previous and next objects in the list and update their next/prev pointers to point to one another. But if either neighbor is already locked, this would result in a deadlock. I've tried to work out a way that any/all of the removals taking place could walk the locked part of the list and determine the maximal sublist that's currently in the process of removal, then lock the nodes adjacent to that sublist so that the whole sublist gets removed as a whole, but my head is starting to hurt.. :-P

Conclusion(?): To follow up, I do have some code I want to get working, but I'm also interested in the theoretical problem. Everyone's answers have been quite helpful, and combined with details of the constraints outside what I expressed here (you really don't want to know where the pointer-to-element-to-be-removed came from and the synchronization involved there!) I've decided to abandon the local-lock code for now and focus on:

using a larger number of smaller lists which each have individual locks.
minimizing the number of instructions over which locks are held and poking at memory (in a safe way) prior to acquiring a lock to reduce the possibility of page faults and cache misses while a lock is held.
measuring the contention under artificially-high load and evaluating whether this approach is satisfactory.
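
A minimal sketch of the "many small lists, one lock each" idea, not the asker's actual code. The names (`NUM_LISTS`, `small_list`, `striped_insert`, `striped_remove`) are hypothetical, and picking the list from the node's address is an assumption made here so that removal by pointer can find the right lock without storing anything extra in the node. Note that this gives FIFO order only within each small list, not globally.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

#define NUM_LISTS 16  /* hypothetical; tune after measuring contention */

struct node { struct node *prev, *next; };  /* next == NULL means "not in a list" */

struct small_list {
    atomic_flag lock;      /* one spinlock per list, not per node */
    struct node sentinel;  /* circular list: empty when it points at itself */
};

static struct small_list lists[NUM_LISTS];

static void lists_init(void) {
    for (size_t i = 0; i < NUM_LISTS; i++) {
        atomic_flag_clear(&lists[i].lock);
        lists[i].sentinel.prev = lists[i].sentinel.next = &lists[i].sentinel;
    }
}

/* Assumption: a node always lives in the list selected by its address. */
static struct small_list *list_for(const struct node *n) {
    return &lists[((uintptr_t)n >> 6) % NUM_LISTS];
}

static void list_lock(struct small_list *l) {
    while (atomic_flag_test_and_set_explicit(&l->lock, memory_order_acquire)) {
        /* spin; a real implementation would pause or yield here */
    }
}

static void list_unlock(struct small_list *l) {
    atomic_flag_clear_explicit(&l->lock, memory_order_release);
}

/* Append n to the tail of its small list. */
static void striped_insert(struct node *n) {
    struct small_list *l = list_for(n);
    list_lock(l);
    n->prev = l->sentinel.prev;
    n->next = &l->sentinel;
    l->sentinel.prev->next = n;
    l->sentinel.prev = n;
    list_unlock(l);
}

/* Unlink n if it is still linked; returns 1 if this call removed it. */
static int striped_remove(struct node *n) {
    struct small_list *l = list_for(n);
    int removed = 0;
    list_lock(l);
    if (n->next != NULL) {
        n->prev->next = n->next;
        n->next->prev = n->prev;
        n->prev = n->next = NULL;
        removed = 1;
    }
    list_unlock(l);
    return removed;
}
```

Nodes must start with `next == NULL`, and `lists_init()` has to run before any thread touches the lists. Only the handful of pointer assignments happen under the lock, which is in the spirit of the second point above.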

Thanks again to everybody who gave answers. If my experiment doesn't go well I might come back to the approaches outlined (especially Vlad's) and try again.

葵雨 2024-10-12 06:35:54

Why not just apply a coarse-grained lock? Just lock the whole queue.

A more elaborate (though not necessarily more efficient; it depends on your usage pattern) solution would be a read-write lock, with separate modes for reading and writing.

Using lock-free operations doesn't seem like a very good idea to me for your case. Imagine that some thread is traversing your queue, and at the same moment the "current" item is deleted. It doesn't matter how many additional links your traversal algorithm holds; all of those items may be deleted, so your code would have no chance to finish the traversal.

Another issue with compare-and-swap is that with pointers you never know whether the pointer really points to the same old structure, or whether the old structure has been freed and a new structure allocated at the same address (the ABA problem). This may or may not be an issue for your algorithms.

For the case of "local" locking (i.e., the possibility of locking each list item separately), one idea would be to make the locks ordered. Ordering the locks makes a deadlock impossible. So your operations look like this:

Delete, given a pointer p to the previous item:

1. lock p, check (perhaps using a special flag in the item) that the item is still in the list
2. lock p->next, check that it's not null and is in the list; this way you ensure that p->next->next won't be removed in the meantime
3. lock p->next->next
4. set a flag in p->next indicating that it's not in the list
5. (p->next->next->prev, p->next->prev) = (p, null); (p->next, p->next->next) = (p->next->next, null)
6. release the locks
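
The delete-after-p steps above can be sketched roughly as follows. This is an untested-under-contention illustration, not a vetted implementation. Assumptions not in the answer: a one-flag spinlock (`atomic_flag`) stands in for the per-item lock, the list has a permanent head node so every item has a predecessor, the tail's `next` is null, and the helper names (`node_init`, `node_lock`, `node_unlock`, `remove_after`) are made up for the example.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

struct node {
    struct node *prev, *next;
    atomic_flag lock;  /* per-item lock, small enough for a fixed-size item */
    bool in_list;      /* the "special flag"; read and written under lock */
};

static void node_init(struct node *n) {
    n->prev = n->next = NULL;
    atomic_flag_clear(&n->lock);
    n->in_list = false;
}

static void node_lock(struct node *n) {
    while (atomic_flag_test_and_set_explicit(&n->lock, memory_order_acquire)) {
        /* spin */
    }
}

static void node_unlock(struct node *n) {
    atomic_flag_clear_explicit(&n->lock, memory_order_release);
}

/* Remove the item that follows p. Locks are always taken head-to-tail
   (p, then p->next, then p->next->next), which is the lock ordering.
   Returns the removed item, or NULL if there was nothing to remove. */
static struct node *remove_after(struct node *p) {
    node_lock(p);
    if (!p->in_list || p->next == NULL) {
        node_unlock(p);
        return NULL;
    }
    struct node *victim = p->next;
    node_lock(victim);
    if (!victim->in_list) {
        node_unlock(victim);
        node_unlock(p);
        return NULL;
    }
    struct node *after = victim->next;  /* NULL when victim is the tail */
    if (after != NULL)
        node_lock(after);

    victim->in_list = false;
    p->next = after;
    if (after != NULL)
        after->prev = p;
    victim->prev = victim->next = NULL;

    if (after != NULL)
        node_unlock(after);
    node_unlock(victim);
    node_unlock(p);
    return victim;
}
```

Because the predecessor is locked before its `next` pointer is read, the victim cannot be unlinked by another thread in between, which is why the sketch never needs to retry.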

Insert at the beginning:

1. lock head
2. set the flag in the new item indicating that it's in the list
3. lock the new item
4. lock head->next
5. (head->next->prev, new->prev) = (new, head); (new->next, head->next) = (head->next, new)
6. release the locks

This seems to be correct, although I haven't tried it.

Essentially, this makes the double-linked list work as if it were a single-linked list.

If you don't have the pointer to the previous list element (which is of course usually the case, as it's virtually impossible to keep such a pointer in consistent state), you can do the following:

Delete, given a pointer c to the item to be deleted:

1. lock c, check if it is still a part of the list (this has to be a flag in the list item), if not, operation fails
2. obtain pointer p = c->prev
3. unlock c (now, c may be moved or deleted by other thread, p may be moved or deleted from the list as well) [in order to avoid the deallocation of c, you need to have something like shared pointer or at least a kind of refcounting for list items here]
4. lock p
5. check if p is a part of the list (it could be deleted after step 3); if not, unlock p and restart from the beginning
6. check if p->next equals c, if not, unlock p and restart from the beginning [here we can maybe optimize out the restart, not sure ATM]
7. lock p->next; here you can be sure that p->next==c and is not deleted, because the deletion of c would have required locking of p
8. lock p->next->next; now all the locks are taken, so we can proceed
9. set the flag that c is not a part of the list
10. perform the customary (p->next, c->next, c->prev, c->next->prev) = (c->next, null, null, p)
11. release all the locks
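
A rough sketch of the delete-by-c steps above, under stated assumptions rather than a proven implementation. Assumed here and not given in the answer: a one-flag spinlock per item, a permanent head node that is always in the list (so c->prev is never null while c is linked), a null-terminated tail, and that the caller already holds a reference keeping c and any former predecessor alive (the refcounting mentioned in step 3 is not shown). All helper names are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

struct node {
    struct node *prev, *next;
    atomic_flag lock;  /* per-item spinlock */
    bool in_list;      /* read and written only while holding lock */
};

static void node_init(struct node *n) {
    n->prev = n->next = NULL;
    atomic_flag_clear(&n->lock);
    n->in_list = false;
}

static void node_lock(struct node *n) {
    while (atomic_flag_test_and_set_explicit(&n->lock, memory_order_acquire)) {
        /* spin */
    }
}

static void node_unlock(struct node *n) {
    atomic_flag_clear_explicit(&n->lock, memory_order_release);
}

/* Returns true if this call unlinked c, false if c was already gone. */
static bool remove_node(struct node *c) {
    for (;;) {
        /* Steps 1-3: peek at the predecessor, then let go of c so that
           locks are only ever acquired in head-to-tail order. */
        node_lock(c);
        if (!c->in_list) {
            node_unlock(c);
            return false;
        }
        struct node *p = c->prev;
        node_unlock(c);

        /* Steps 4-6: lock p and check that it is still c's predecessor. */
        node_lock(p);
        if (!p->in_list || p->next != c) {
            node_unlock(p);
            continue;  /* p or c changed in the meantime; start over */
        }

        /* Steps 7-8: c cannot be unlinked now, because that needs p's lock. */
        node_lock(c);
        struct node *after = c->next;  /* NULL when c is the tail */
        if (after != NULL)
            node_lock(after);

        /* Steps 9-10: mark and unlink. */
        c->in_list = false;
        p->next = after;
        if (after != NULL)
            after->prev = p;
        c->prev = c->next = NULL;

        /* Step 11: release everything. */
        if (after != NULL)
            node_unlock(after);
        node_unlock(c);
        node_unlock(p);
        return true;
    }
}
```

On a retry the loop re-checks c's flag first, so if another thread removed c in the meantime the call returns false instead of spinning.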

Note that just having a pointer to some list item cannot ensure that the item is not deallocated, so you'll need to have a kind of refcounting, so that the item is not destroyed at the very moment you try to lock it.

Note that in the last algorithm the number of retries is bounded. Indeed, new items cannot appear on the left of c (insertion is at the rightmost position). If our step 5 fails and thus we need a retry, this can be caused only by having p removed from the list in the meanwhile. Such a removal can occur not more than N-1 times, where N is the initial position of c in the list. Of course, this worst case is rather unlikely to happen.

飘然心甜 2024-10-12 06:35:54

Please don't take this answer harshly, but don't do this.

You will almost certainly wind up with bugs, and very hard bugs to find at that. Use the pthreads lock primitives. They are your friends, and have been written by people who deeply understand the memory model provided by your processor of choice. If you try to do the same thing with CAS and atomic increment and the like, you will almost certainly make some subtle mistake that you won't find until it's far too late.

Here's a little code example to help illustrate the point. What's wrong with this lock?

volatile int lockTaken = 0;

void EnterSpinLock() {
  while (!__sync_bool_compare_and_swap(&lockTaken, 0, 1)) { /* wait */ }
}

void LeaveSpinLock() {
  lockTaken = 0;
}

The answer is: there's no memory barrier when releasing the lock, meaning that some of the write operations executed within the lock may not have happened before the next thread gets into the lock. Yikes! (There are probably many more bugs too, for example, the function doesn't do the platform-appropriate yield inside the spin loop and so is hugely wasteful of CPU cycles. &c.)
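
For comparison, here is a sketch of the same lock with the release side repaired, using GCC's legacy `__sync` builtins. The snake_case names are made up to keep it apart from the broken version above, and the missing yield or pause in the spin loop is deliberately left as is.

```c
#include <assert.h>

static volatile int lock_taken = 0;

static void enter_spin_lock(void) {
    /* The CAS is a full barrier in the __sync family, so nothing from the
       critical section can be reordered before the lock is acquired. */
    while (!__sync_bool_compare_and_swap(&lock_taken, 0, 1)) {
        /* wait; a real lock would pause or yield here */
    }
}

static void leave_spin_lock(void) {
    /* Writes 0 with release semantics: stores made while holding the lock
       are visible before another thread can observe the lock as free. */
    __sync_lock_release(&lock_taken);
}
```

In C11 the equivalent would be an `atomic_flag` with `memory_order_acquire` on entry and `memory_order_release` on exit.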

If you implement your doubly-linked list as a circular list with a sentinel node, then you only need to perform two pointer assignments in order to remove an item from the list, and four to add an item. I'm sure you can afford to hold a well-written exclusive lock over those pointer assignments.
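
A small sketch of the circular list with a sentinel, to show where the "two assignments to remove, four to add" count comes from. The function names are illustrative, and the locking itself is left out: the assumption is that the caller holds one exclusive lock (for example a `pthread_mutex_t` owned by the list, not by the items) around each call.

```c
#include <assert.h>
#include <stddef.h>

struct node {
    struct node *prev;
    struct node *next;
};

/* An empty list is a sentinel that points at itself. */
static void list_init(struct node *sentinel) {
    sentinel->prev = sentinel;
    sentinel->next = sentinel;
}

/* Append at the tail, i.e. just before the sentinel: four assignments. */
static void list_push_back(struct node *sentinel, struct node *n) {
    n->prev = sentinel->prev;
    n->next = sentinel;
    sentinel->prev->next = n;
    sentinel->prev = n;
}

/* Unlink from anywhere: two assignments, no head or tail special cases. */
static void list_remove(struct node *n) {
    n->prev->next = n->next;
    n->next->prev = n->prev;
}

/* Take the first item; returns NULL when the list is empty. */
static struct node *list_pop_front(struct node *sentinel) {
    struct node *n = sentinel->next;
    if (n == sentinel)
        return NULL;
    list_remove(n);
    return n;
}
```

These three operations cover everything the question asks for: insertion at the end, deletion from the beginning, and deletion of an arbitrary item given only a pointer to it.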

Note that I am assuming that you are not one of the few people who deeply understand memory models only because there are very few of them in the world. If you are one of these people, the fact that even you can't figure it out ought to be an indication of how tricky it is. :)

I am also assuming that you're asking this question because you have some code you'd actually like to get working. If this is simply an academic exercise in order to learn more about threading (perhaps as a step on your way to becoming a deep low-level concurrency expert) then by all means, ignore me, and do your research on the details of the memory model of the platform you're targeting. :)

万水千山粽是情ミ 2024-10-12 06:35:54

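You can avoid deadlock if you maintain a strict lock hierarchy: if you are locking multiple nodes, always lock the one closer to the head of the list first. So, to delete an element, first lock the node's predecessor, then the node itself, then the node's successor; unlink the node; and then release the locks in reverse order.

This way, if multiple threads try to delete adjacent nodes at the same time (say, nodes B and C in the chain ABCD), whichever thread gets the lock on node B first will be the first to unlink. Thread 1 will lock A, then B, then C, and thread 2 will lock B, then C, then D. There is contention only for B, and there is no way for thread 1 to hold a lock while waiting for a lock held by thread 2 at the same time as thread 2 is waiting for a lock held by thread 1 (i.e., deadlock).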

偏爱自由 2024-10-12 06:35:54

You cannot get away without a lock for the whole list. Here's why:

Insert into an Empty List

Threads A and B want to insert an object.

Thread A examines the list and finds it empty.

A context switch occurs.

Thread B examines the list, finds it empty and updates the head and tail to point to its object.

A context switch occurs.

Thread A updates the head and tail to point to its object. Thread B's object has been lost.

Delete an item from the middle of the list

Thread A wants to delete node X. For this it first has to lock X's predecessor, X itself and X's successor since all of these nodes will be affected by the operation. To lock X's predecessor you must do something like

spin_lock(&(X->prev->lockFlag));

Although I've used function call syntax, if spin_lock is a function, you are dead in the water because that involves at least three operations before you actually have the lock:

place the address of the lock flag on the stack (or in a register)
call the function
do the atomic test and set

There are two places there where thread A can be swapped out and another thread can get in and remove X's predecessor without thread A knowing that X's predecessor has changed. So you have to implement the spin lock itself atomically. That is, you have to add an offset to X to get X->prev, dereference it to get *(X->prev), add an offset to that to get lockFlag, and then do the atomic operation, all as one atomic unit. Otherwise there is always an opportunity for something to sneak in after you have committed to locking a particular node but before you have actually locked it.
