Simple custom mutex failing
Can you spot the error in the code? The ticket count ends up going below 0, causing long stalls.
struct SContext {
    volatile unsigned long* mutex;
    volatile long* ticket;
    volatile bool* done;
};
static unsigned int MyThreadFunc(SContext* ctxt) {
    // -- keep going until we signal for thread to close
    while(*ctxt->done == false) {
        while(*ctxt->ticket) { // while we have tickets waiting
            unsigned int lockedaquired = 0;
            do {
                if(*ctxt->mutex == 0) { // only try if someone doesn't have mutex locked
                    // -- InterlockedCompareExchange returns the previous value of *mutex,
                    // -- so 0 means the swap happened and we now hold the lock
                    lockedaquired = InterlockedCompareExchange(ctxt->mutex, 1, 0);
                }
            } while(lockedaquired != 0); // loop while we didn't acquire lock
            // -- enter critical section
            // -- grab a ticket
            if(*ctxt->ticket > 0);
                (*ctxt->ticket)--;
            // -- exit critical section
            *ctxt->mutex = 0; // release lock
        }
    }
    return 0;
}
The calling function waits for the threads to finish:
for(unsigned int loops = 0; loops < eLoopCount; ++loops) {
    *ctxt.ticket = eNumThreads; // let the threads start!
    // -- wait for threads to finish
    while(*ctxt.ticket != 0)
        ;
}
done = true;
EDIT:
The answer to this question is simple. Unfortunately, after I spent the time trimming the example down to a simplified version, I found the answer immediately after posting the question. Sigh..
I initialize lockedaquired to 0. Then, as an optimization to avoid taking up bus bandwidth, I don't do the CAS if the mutex is taken.
Unfortunately, when the lock is taken the CAS is skipped, lockedaquired is still 0, and the while loop lets the second thread through!
Sorry for the extra question. I thought I didn't understand the Windows low-level synchronization primitives, but really I just made a simple mistake.
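For anyone following along, here is a minimal sketch of an acquire loop that keeps the "read before CAS" optimization without the bug described above. It is not the code from the question: std::atomic stands in for InterlockedCompareExchange so it builds outside Windows, and the names SpinLock, lock, try_lock and unlock are made up for illustration. The point is only that the loop exits on a successful CAS and on nothing else.

```cpp
#include <atomic>

// Hypothetical stand-in for the raw mutex word: 0 = free, 1 = held.
struct SpinLock {
    std::atomic<unsigned long> state{0};

    // Returns only once this thread has changed state from 0 to 1.
    void lock() {
        bool acquired = false;  // starts as "not acquired"
        while (!acquired) {
            // Cheap read first: skip the CAS while the lock looks taken.
            if (state.load(std::memory_order_relaxed) == 0) {
                unsigned long expected = 0;
                // Only a successful CAS sets acquired to true.
                acquired = state.compare_exchange_strong(
                    expected, 1UL, std::memory_order_acquire,
                    std::memory_order_relaxed);
            }
        }
    }

    // Single attempt, handy for checking the lock without spinning.
    bool try_lock() {
        unsigned long expected = 0;
        return state.compare_exchange_strong(expected, 1UL,
                                             std::memory_order_acquire,
                                             std::memory_order_relaxed);
    }

    void unlock() { state.store(0, std::memory_order_release); }
};
```

If the read is skipped because the lock looks taken, acquired stays false and the thread keeps spinning, which is the behavior the original loop was meant to have.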
I see another race in your code: one thread can cause *ctxt.ticket to hit 0, allowing the parent loop to go back and reset *ctxt.ticket = eNumThreads without holding *ctxt.mutex. Some other thread may already hold the mutex by then (in fact, it probably does) and be operating on *ctxt.ticket. For your simplified example this only prevents "batches" from being cleanly separated, but if you had more complex initialization (more complex than a single word write) at the top of the loops loop, you could see strange behavior.
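One way to close the race on the reset, sketched under assumptions: the driver takes the same lock as the workers before refilling the ticket count, and reads the count under that lock as well. This is not code from the question. The names Shared, acquire, release, refill and remaining are hypothetical, and std::atomic stands in for the interlocked calls.

```cpp
#include <atomic>

struct Shared {
    std::atomic<unsigned long> mutex{0};  // 0 = free, 1 = held
    long ticket = 0;                      // only touched while mutex is held
};

// Spin until the lock word goes from 0 to 1.
void acquire(Shared& s) {
    unsigned long expected = 0;
    while (!s.mutex.compare_exchange_weak(expected, 1UL,
                                          std::memory_order_acquire,
                                          std::memory_order_relaxed)) {
        expected = 0;  // a failed CAS overwrites expected with the current value
    }
}

void release(Shared& s) { s.mutex.store(0, std::memory_order_release); }

// Driver side: refill the batch while holding the same lock as the workers.
void refill(Shared& s, long count) {
    acquire(s);
    s.ticket = count;
    release(s);
}

// Driver side: read the remaining count under the lock as well.
long remaining(Shared& s) {
    acquire(s);
    long left = s.ticket;
    release(s);
    return left;
}
```

With the reset inside the lock, a worker is never in the middle of its critical section while the count changes underneath it. Any heavier per-batch initialization would go inside refill for the same reason.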
I posted this as what I thought was a legitimate multithreading problem, but really it was just bad logic, and I solved the bug as soon as I posted. The problem lines are the do-while loop that acquires the lock.
I initialized lockedaquired to 0 and later added an if statement to skip the expensive CAS when the mutex is already taken. With that optimization, a thread that sees the mutex held never runs the CAS, so lockedaquired stays 0 and the thread falls out of the while loop and into the critical section. Making sure lockedaquired stays non-zero until the CAS actually succeeds fixes the problem.
There is another hidden problem in the code that I found as well (I really shouldn't code late at night anymore). Anyone notice the semicolon after the if statement in the critical section? Sigh... It turns the if into an empty statement, so the decrement runs unconditionally. Removing the semicolon lets the check guard the decrement.
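A small sketch of the critical-section body with the stray semicolon gone. The helper name take_ticket is hypothetical, it takes a plain long& rather than the volatile pointer from the question, and the caller is assumed to already hold the lock.

```cpp
// Hypothetical helper; the caller is assumed to hold the lock.
// Returns true if a ticket was taken; the count never drops below 0.
bool take_ticket(long& ticket) {
    if (ticket > 0) {  // no semicolon here, so the check guards the decrement
        --ticket;
        return true;
    }
    return false;
}
```

Using braces around the body makes this kind of slip harder to repeat, since an empty statement after the condition stands out.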
Also, Ben Jackson pointed out that a thread will probably be inside the critical section when we reset the ticket to eNumThreads. While this is perfectly fine in this sample code, it might not be safe if you applied it to a problem that needs more operations, because the threads aren't running in lockstep. Keep that in mind if you apply this to your code.
A final note: if anyone does decide to use this code for their own mutex implementation, please remember that your main driver thread is spinning idle. If you are doing a large operation in the critical section that takes a good deal of time and your ticket count is high, consider yielding your thread to let other software make use of the CPU while it's waiting. Also, if the critical section is large, consider a blocking lock instead of spinning.
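As a rough illustration of the yielding idea, here is a driver-side wait that gives up its time slice on each pass instead of busy-spinning. It is a sketch, not code from the question: the name wait_until_drained is hypothetical, and std::this_thread::yield is used for portability (on Windows, SwitchToThread or Sleep(0) would play a similar role).

```cpp
#include <atomic>
#include <thread>

// Hypothetical driver-side wait: yields instead of busy-spinning.
// Returns how many times it yielded before the count reached 0.
unsigned long wait_until_drained(const std::atomic<long>& ticket) {
    unsigned long yields = 0;
    while (ticket.load(std::memory_order_acquire) != 0) {
        std::this_thread::yield();  // let other runnable threads use the CPU
        ++yields;
    }
    return yields;
}
```

Yielding trades a little wake-up latency for less wasted CPU, so it makes the most sense when the workers hold the lock for a long time.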
Thank you