Is this use of condition variables always subject to a lost-signal race?
Suppose a condition variable is used in a situation where the signaling thread modifies the state affecting the truth value of the predicate and calls pthread_cond_signal without holding the mutex associated with the condition variable. Is it true that this type of usage is always subject to race conditions where the signal may be missed?
To me, there seems to always be an obvious race:
- Waiter evaluates the predicate as false, but before it can begin waiting...
- Another thread changes state in a way that makes the predicate true.
- That other thread calls pthread_cond_signal, which does nothing because there are no waiters yet.
- The waiter thread enters pthread_cond_wait, unaware that the predicate is now true, and waits indefinitely.
But does this same kind of race condition always exist if the situation is changed so that either (A) the mutex is held while calling pthread_cond_signal, just not while changing the state, or (B) the mutex is held while changing the state, just not while calling pthread_cond_signal?
I'm asking from a standpoint of wanting to know if there are any valid uses of the above not-best-practices usages, i.e. whether a correct condition-variable implementation needs to account for such usages in avoiding race conditions itself, or whether it can ignore them because they're already inherently racy.
Comments (3)
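The fundamental race here looks like this: thread A locks the mutex and checks the state, thread B then changes the state and signals, and only afterwards does thread A call pthread_cond_wait, so it never wakes up.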
If we take a lock EITHER on the state change OR the signal, OR both, then we avoid this; it's not possible for both the state-change and the signal to occur while thread A is in its critical section and holding the lock.
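As an illustration of the "lock on the signal only" variant, here is a minimal sketch. The names `ready`, `waiter`, `signaler` and `run_signal_under_lock` are hypothetical, and the predicate is assumed to be a C11 atomic so that changing it outside the mutex is not itself a data race.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  cond = PTHREAD_COND_INITIALIZER;
static atomic_bool ready;   /* hypothetical predicate; atomic because it is written without the mutex */

/* Thread A: checks the predicate and waits, all while holding the mutex. */
static void *waiter(void *arg)
{
    bool *saw_ready = arg;
    pthread_mutex_lock(&lock);
    while (!atomic_load(&ready))          /* the loop also covers spurious wakeups */
        pthread_cond_wait(&cond, &lock);  /* atomically releases the mutex and sleeps */
    *saw_ready = true;
    pthread_mutex_unlock(&lock);
    return NULL;
}

/* Thread B: state change outside the mutex, signal inside it. */
static void *signaler(void *arg)
{
    (void)arg;
    atomic_store(&ready, true);
    /* This lock cannot be acquired while A sits between its check and its wait. */
    pthread_mutex_lock(&lock);
    pthread_cond_signal(&cond);
    pthread_mutex_unlock(&lock);
    return NULL;
}

/* Hypothetical driver: returns 1 if the waiter observed the predicate. */
int run_signal_under_lock(void)
{
    pthread_t a, b;
    bool saw_ready = false;
    atomic_store(&ready, false);
    pthread_create(&a, NULL, waiter, &saw_ready);
    pthread_create(&b, NULL, signaler, NULL);
    pthread_join(a, NULL);
    pthread_join(b, NULL);
    return saw_ready ? 1 : 0;
}
```

Whichever thread runs first, the waiter either sees `ready` already set or is fully asleep on the condition variable by the time the signal is sent.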
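If we consider the reverse case, where thread A interleaves into thread B, there's no problem: thread B changes the state, thread A then locks the mutex, checks the state, finds nothing to wait for and unlocks, and thread B's signal arrives with nobody waiting, which is harmless.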
So there's no particular need for thread B to hold a mutex over the entire operation; it just needs to hold the mutex for some interval, possibly infinitesimally small, between the state change and the signal. Of course, if the state itself requires locking for safe manipulation, then the lock must be held over the state change as well.
Finally, note that dropping the mutex early is unlikely to be a performance improvement in most cases. Requiring the mutex to be held reduces contention over the internal locks in the condition variable, and in modern pthreads implementations, the system can 'move' the waiting thread from waiting on the cvar to waiting on the mutex without waking it up (thus avoiding it waking up only to immediately block on the mutex).
As pointed out in the comments, dropping the mutex may improve performance in some cases by reducing the number of syscalls needed. Then again, it could also lead to extra contention on the condition variable's internal mutex. Hard to say. It's probably not worth worrying about in any case.
Note that the applicable standards require that pthread_cond_signal be safely callable without holding the mutex. This usually means that condition variables have an internal lock over their internal data structures, or otherwise use some very careful lock-free algorithm.
The state must be modified inside a mutex, if for no other reason than the possibility of spurious wake-ups, which would lead to the reader reading the state while the writer is in the middle of writing it.
You can call pthread_cond_signal anytime after the state is changed. It doesn't have to be inside the mutex. POSIX guarantees that at least one waiter will awaken to check the new state. More to the point: pthread_cond_signal doesn't guarantee that a reader will acquire the mutex first. Another writer might get in before a reader gets a chance to check the new status. Condition variables don't guarantee that readers immediately follow writers. (After all, what if there are no readers?)
EDIT: @DietrichEpp makes a good point in the comments. The writer must change the state in such a way that the reader can never access an inconsistent state. It can do so either by acquiring the mutex used in the condition-variable, as I indicate above, or by ensuring that all state-changes are atomic.
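A minimal sketch of the pattern this answer describes: the state is changed under the mutex, the signal is sent after unlocking, and the reader re-checks the predicate in a loop. The counter `pending` and the function names are hypothetical, chosen only for illustration.

```c
#include <assert.h>
#include <pthread.h>

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  cond = PTHREAD_COND_INITIALIZER;
static int pending;   /* hypothetical shared state, only touched with `lock` held */

/* Writer: change the state under the mutex, signal after releasing it. */
static void *writer(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&lock);
    pending += 1;
    pthread_mutex_unlock(&lock);
    pthread_cond_signal(&cond);   /* allowed without holding the mutex */
    return NULL;
}

/* Reader: loop on the predicate, since a wakeup may be spurious or
 * another reader may have consumed the item first. */
static void *reader(void *arg)
{
    int *taken = arg;
    pthread_mutex_lock(&lock);
    while (pending == 0)
        pthread_cond_wait(&cond, &lock);
    pending -= 1;
    *taken += 1;
    pthread_mutex_unlock(&lock);
    return NULL;
}

/* Hypothetical driver: N writers post one item each, N readers take one each.
 * Returns the total number of items taken. */
int run_signal_after_unlock(void)
{
    enum { N = 4 };
    pthread_t readers[N], writers[N];
    int taken[N] = {0};
    int total = 0;

    for (int i = 0; i < N; i++)
        pthread_create(&readers[i], NULL, reader, &taken[i]);
    for (int i = 0; i < N; i++)
        pthread_create(&writers[i], NULL, writer, NULL);
    for (int i = 0; i < N; i++) {
        pthread_join(writers[i], NULL);
        pthread_join(readers[i], NULL);
        total += taken[i];
    }
    return total;
}
```

Because every reader checks `pending` under the same mutex the writers use, a reader cannot slip between a writer's update and its own wait, even though the signal itself is sent unlocked.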
The answer is, there is a race, and to eliminate that race, you must do this:
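Going by the description further down (update the shared value atomically, briefly lock and unlock the mutex, then signal), a minimal sketch might look like the following. It is an illustration under assumptions, not the answerer's kernel code: `posted`, `sleeper`, `poster` and the driver are hypothetical names, and the shared value is assumed to be a C11 atomic.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

static pthread_mutex_t m = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  c = PTHREAD_COND_INITIALIZER;
static atomic_int posted;   /* hypothetical value, updated atomically outside the mutex */

/* Waiting side: checks the value and sleeps while holding the mutex. */
static void *sleeper(void *arg)
{
    int *seen = arg;
    pthread_mutex_lock(&m);
    while (atomic_load(&posted) == 0)
        pthread_cond_wait(&c, &m);
    *seen = atomic_load(&posted);
    pthread_mutex_unlock(&m);
    return NULL;
}

/* Signaling side: atomic update, empty critical section, then signal. */
static void *poster(void *arg)
{
    (void)arg;
    atomic_store(&posted, 42);   /* atomic op outside of the mutex */

    pthread_mutex_lock(&m);      /* blocks while a waiter is between its check and its sleep */
    pthread_mutex_unlock(&m);

    pthread_cond_signal(&c);     /* any waiter that missed the update is asleep by now */
    return NULL;
}

/* Hypothetical driver: returns the value the sleeper observed. */
int run_lock_unlock_then_signal(void)
{
    pthread_t a, b;
    int seen = 0;
    atomic_store(&posted, 0);
    pthread_create(&a, NULL, sleeper, &seen);
    pthread_create(&b, NULL, poster, NULL);
    pthread_join(a, NULL);
    pthread_join(b, NULL);
    return seen;
}
```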
The protection of the data doesn't matter, because you don't hold the mutex when calling pthread_cond_signal anyway.
See, by locking and unlocking the mutex, you have created a barrier. During that brief moment when the signaler has the mutex, there is a certainty: no other thread has the mutex. This means no other thread is executing any critical regions.
This means that all threads are either about to get the mutex to discover the change you have posted, or else they have already found that change and run off with it (releasing the mutex), or else they have not found what they are looking for and have atomically given up the mutex to go to sleep (and are guaranteed to be waiting nicely on the condition).
Without the mutex lock/unlock, you have no synchronization. The signal will sometimes fire as threads which didn't see the changed atomic value are transitioning to their atomic sleep to wait for it.
So this is what the mutex does from the point of view of a thread which is signaling. You can get the atomicity of access from something else, but not the synchronization.
P.S. I have implemented this logic before. The situation was in the Linux kernel (using my own mutexes and condition variables).
In my situation, it was impossible for the signaler to hold the mutex for the atomic operation on shared data. Why? Because the signaler did the operation in user space, inside a buffer shared between the kernel and user, and then (in some situations) made a system call into the kernel to wake up a thread. User space simply made some modifications to the buffer, and then if some conditions were satisfied, it would perform an ioctl.
So in the ioctl call I did the mutex lock/unlock thing, and then hit the condition variable. This ensured that the thread would not miss the wakeup related to that latest modification posted by user space.
At first I just had the condition variable signal, but it looked wrong without the involvement of the mutex, so I reasoned about the situation a little bit and realized that the mutex must simply be locked and unlocked to conform to the synchronization ritual which eliminates the lost wakeup.