ReleaseSemaphore does not release the semaphore
(In short: main()'s WaitForSingleObject hangs in the program below).
I'm trying to write a piece of code that dispatches threads and waits for them to finish before it resumes. Instead of creating the threads every time, which is costly, I put them to sleep. The main thread creates X threads in CREATE_SUSPENDED state.
The synchronization is done with a semaphore with X as MaximumCount. The semaphore's counter is brought down to zero and the threads are dispatched. The threads perform some silly loop and call ReleaseSemaphore before they go to sleep. Then the main thread calls WaitForSingleObject X times to be sure every thread has finished its job and is sleeping. Then it loops and does it all again.
From time to time the program does not exit. When I break into the program I can see that WaitForSingleObject hangs. This means that a thread's ReleaseSemaphore did not work. Nothing is printf'ed, so supposedly nothing went wrong.
Maybe two threads shouldn't call ReleaseSemaphore at the exact same time, but that would nullify the purpose of semaphores...
I just don't grok it...
Other solutions to synch threads are gratefully accepted!
#include <windows.h>
#include <stdio.h>
#include <math.h>

#define TRY 100
#define LOOP 100

HANDLE *ids;
HANDLE semaphore;

// Worker: run the dummy loop, signal the semaphore, then suspend itself.
DWORD WINAPI Count(__in LPVOID lpParameter)
{
    float x = 1.0f;
    while(1)
    {
        for (int i=1 ; i<LOOP ; i++)
            x = sqrt((float)i*x);
        while (ReleaseSemaphore(semaphore,1,NULL) == FALSE)
            printf(" ReleaseSemaphore error : %d ", GetLastError());
        SuspendThread(ids[(int) lpParameter]);
    }
    return (DWORD)(int)x;
}

int main()
{
    SYSTEM_INFO sysinfo;
    GetSystemInfo( &sysinfo );
    int numCPU = sysinfo.dwNumberOfProcessors;

    semaphore = CreateSemaphore(NULL, numCPU, numCPU, NULL);
    ids = new HANDLE[numCPU];

    // One suspended worker per processor.
    for (int j=0 ; j<numCPU ; j++)
        ids[j] = CreateThread(NULL, 0, Count, (LPVOID)j, CREATE_SUSPENDED, NULL);

    for (int j=0 ; j<TRY ; j++)
    {
        // Bring the semaphore count down and dispatch the workers.
        for (int i=0 ; i<numCPU ; i++)
        {
            if (WaitForSingleObject(semaphore,1) == WAIT_TIMEOUT)
                printf("Timed out !!!\n");
            ResumeThread(ids[i]);
        }
        // Wait for every worker to signal, then refill the semaphore.
        for (int i=0 ; i<numCPU ; i++)
            WaitForSingleObject(semaphore,INFINITE);
        ReleaseSemaphore(semaphore,numCPU,NULL);
    }

    CloseHandle(semaphore);
    printf("Done\n");
    getc(stdin);
}
Comments (5)
Instead of using a semaphore (at least directly) or having main explicitly wake up a thread to get some work done, I've always used a thread-safe queue. When main wants a worker thread to do something, it pushes a description of the job to be done onto the queue. The worker threads each just do a job, then try to pop another job from the queue, and end up suspended until there's a job in the queue for them to do:
The code for the queue looks like this:
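The answer's own queue listing is not reproduced here. As a stand-in, here is a minimal sketch of a blocking thread-safe queue, written in portable standard C++ rather than against the Win32 API. The names `BlockingQueue` and `queue_demo` are hypothetical and do not come from the answer.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>

// Hypothetical stand-in for the answer's queue: a blocking, thread-safe FIFO.
template <typename T>
class BlockingQueue {
public:
    // Add an item and wake one thread blocked in pop().
    void push(T item) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            items_.push(std::move(item));
        }
        not_empty_.notify_one();
    }

    // Block until an item is available, then remove and return it.
    T pop() {
        std::unique_lock<std::mutex> lock(mutex_);
        not_empty_.wait(lock, [this] { return !items_.empty(); });
        T item = std::move(items_.front());
        items_.pop();
        return item;
    }

private:
    std::mutex mutex_;
    std::condition_variable not_empty_;
    std::queue<T> items_;
};

// Usage sketch: a producer thread pushes 0..4 and the caller pops them.
// Returns the sum of the popped values.
int queue_demo() {
    BlockingQueue<int> q;
    std::thread producer([&q] {
        for (int i = 0; i < 5; ++i) q.push(i);
    });
    int sum = 0;
    for (int i = 0; i < 5; ++i) {
        int v = q.pop();   // sleeps while the queue is empty
        assert(v == i);    // single producer, so FIFO order is preserved
        sum += v;
    }
    producer.join();
    return sum;
}
```

The point of the design is that a worker blocked in `pop()` is asleep inside the queue itself, so nobody has to suspend or resume it from outside.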
And a rough equivalent of your code in the threads to use it looks something like this. I didn't sort out exactly what your thread function was doing, but it was something with summing square roots, and apparently you're more interested in the thread synch than what the threads actually do, for the moment.
Edit: (based on comment):
If you need main() to wait for some tasks to finish, do some more work, then assign more tasks, it's generally best to handle that by putting an event (for example) into each task, and having your thread function set the event. Revised code to do that would look like this (note that the queue code isn't affected):
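The revised listing is not reproduced here either. The following is a rough sketch of the idea under stated assumptions: it uses portable standard C++ instead of Win32, `Event` is a hand-rolled stand-in for a manual-reset event, and `Task`, `TaskQueue`, `worker` and `run_rounds` are hypothetical names. The loop body is the square-root loop from the question.

```cpp
#include <cassert>
#include <cmath>
#include <condition_variable>
#include <functional>
#include <memory>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>
#include <vector>

// Portable stand-in for a Win32 manual-reset event (assumption, not the answer's code).
class Event {
public:
    void set() {
        { std::lock_guard<std::mutex> lock(m_); signaled_ = true; }
        cv_.notify_all();
    }
    void wait() {
        std::unique_lock<std::mutex> lock(m_);
        cv_.wait(lock, [this] { return signaled_; });
    }
private:
    std::mutex m_;
    std::condition_variable cv_;
    bool signaled_ = false;
};

// Hypothetical task: loop count, result slot, quit flag, completion event.
struct Task {
    int loops = 0;
    float result = 0.0f;
    bool quit = false;   // sentinel telling a worker to exit
    Event done;
};

class TaskQueue {
public:
    void push(std::shared_ptr<Task> t) {
        { std::lock_guard<std::mutex> lock(m_); q_.push(std::move(t)); }
        cv_.notify_one();
    }
    std::shared_ptr<Task> pop() {
        std::unique_lock<std::mutex> lock(m_);
        cv_.wait(lock, [this] { return !q_.empty(); });
        std::shared_ptr<Task> t = std::move(q_.front());
        q_.pop();
        return t;
    }
private:
    std::mutex m_;
    std::condition_variable cv_;
    std::queue<std::shared_ptr<Task>> q_;
};

// Worker: pop a task, run the square-root loop, signal that task's event.
void worker(TaskQueue& queue) {
    for (;;) {
        std::shared_ptr<Task> t = queue.pop();
        if (t->quit) return;
        float x = 1.0f;
        for (int i = 1; i < t->loops; ++i)
            x = std::sqrt(static_cast<float>(i) * x);
        t->result = x;
        t->done.set();
    }
}

// Runs `rounds` batches on `workers` threads; returns the number of completed tasks.
int run_rounds(int workers, int rounds) {
    TaskQueue queue;
    std::vector<std::thread> pool;
    for (int i = 0; i < workers; ++i)
        pool.emplace_back(worker, std::ref(queue));

    int completed = 0;
    for (int r = 0; r < rounds; ++r) {
        std::vector<std::shared_ptr<Task>> batch;
        for (int i = 0; i < workers; ++i) {
            auto t = std::make_shared<Task>();
            t->loops = 100;
            batch.push_back(t);
            queue.push(t);
        }
        for (auto& t : batch) {   // main blocks until every task in the batch is done
            t->done.wait();
            assert(t->result > 0.0f);
            ++completed;
        }
    }
    for (int i = 0; i < workers; ++i) {   // one quit sentinel per worker
        auto t = std::make_shared<Task>();
        t->quit = true;
        queue.push(t);
    }
    for (auto& th : pool) th.join();
    return completed;
}
```

Because each task carries its own event, main knows exactly which tasks have finished, and it does not matter which worker picked up which task.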
the problem happens in the following case:
the main thread resumes the worker threads:
the worker threads do their work and release the semaphore:
the main thread waits for all worker threads and resets the semaphore:
the main thread goes into the next round, trying to resume the worker threads (note that the worker threads haven't even suspended themselves yet! this is where the problem starts... you are trying to resume threads that aren't necessarily suspended yet):
finally the worker threads suspend themselves (although they should already start the next round):
and the main thread waits forever since all workers are suspended now:
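The lost wake-up in this timeline follows from how the Win32 suspend count works: ResumeThread on a thread whose count is already zero does nothing, and a later SuspendThread still raises the count to one. The sketch below is a toy model of that counter only, with no real threads; `SuspendCountModel` and the two helper functions are hypothetical names used for illustration.

```cpp
#include <cassert>

// Toy model of a Win32 thread's suspend count (no real threads involved).
struct SuspendCountModel {
    int count = 0;   // the thread runs only while count == 0

    // Mirrors SuspendThread: increment, return the previous count.
    int suspend() { return count++; }

    // Mirrors ResumeThread: decrement only if suspended, return the previous count.
    int resume() {
        int prev = count;
        if (count > 0) --count;
        return prev;
    }

    bool running() const { return count == 0; }
};

// Intended order: the worker suspends itself, then main resumes it.
bool intended_order_leaves_thread_running() {
    SuspendCountModel t;
    t.suspend();
    t.resume();
    return t.running();
}

// Racy order from the timeline: main resumes first, then the worker suspends itself.
bool racy_order_leaves_thread_running() {
    SuspendCountModel t;
    int prev = t.resume();
    assert(prev == 0);   // the resume was lost: nothing was suspended yet
    t.suspend();
    return t.running();
}
```

In the racy order the worker ends up suspended with nobody left to resume it, which matches the hang on the last step.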
here's a link that shows how to correctly solve producer/consumer problems:
http://en.wikipedia.org/wiki/Producer-consumer_problem
also i think critical sections are much faster than semaphores and mutexes, since an uncontended critical section avoids a kernel transition. they're also easier to understand in most cases (imo).
I don't understand the code, but the threading sync is definitely bad. You assume that threads will call SuspendThread() in a certain order. A successful WaitForSingleObject() call doesn't tell you which thread called ReleaseSemaphore(). You'll thus call ResumeThread() on a thread that wasn't suspended. This quickly deadlocks the program.
Another bad assumption is that a thread has already called SuspendThread() by the time WaitForSingleObject() returns. Usually yes, not always. The thread could be pre-empted right after the ReleaseSemaphore() call. You'll again call ResumeThread() on a thread that wasn't suspended. That one usually takes a day or so to deadlock your program.
And I think there's one ReleaseSemaphore call too many. Trying to unwedge it, no doubt.
You cannot control threading with SuspendThread()/ResumeThread(), don't try.
The problem is that you are waiting more often than you are signaling.
The for (int j=0 ; j<TRY ; j++) loop waits eight times for the semaphore (assuming four processors), while the four threads will only signal once each and the loop itself signals it once. The first time through the loop, this is not an issue because the semaphore is given an initial count of four. The second and each subsequent time, you are waiting for too many signals. This is mitigated by the fact that on the first four waits you limit the time and don't retry on error. So sometimes it may work and sometimes your wait will hang.
I think the following (untested) changes will help.
Initialize the semaphore to zero count:
semaphore = CreateSemaphore(NULL, 0, numCPU, NULL);
Get rid of the wait in the thread resumption loop (i.e. remove the following):
if (WaitForSingleObject(semaphore,1) == WAIT_TIMEOUT)
    printf("Timed out !!!\n");
Remove the extraneous signal from the end of the try loop (i.e. remove the following):
ReleaseSemaphore(semaphore,numCPU,NULL);
Here is a practical solution.
I wanted my main program to use threads (and thus more than one core) to munch jobs and wait for all the threads to complete before resuming and doing other stuff. I did not want to let the threads die and create new ones because that's slow. In my question, I was trying to do that by suspending the threads, which seemed natural. But as nobugz pointed out, "Thou canst not control threading with Suspend/ResumeThread()".
The solution involves semaphores like the one I was using to control the threads. Actually one more semaphore is used to control the main thread. Now I have one semaphore per thread to control the threads and one semaphore to control the main.
Here is the solution:
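The poster's listing is not reproduced here. What follows is only a sketch of the scheme described above (one semaphore per worker plus one for main), written in portable standard C++ rather than Win32. The `Semaphore` class is a hand-rolled stand-in for CreateSemaphore/ReleaseSemaphore/WaitForSingleObject, and `Shared`, `worker` and `run_with_semaphores` are hypothetical names.

```cpp
#include <cassert>
#include <cmath>
#include <condition_variable>
#include <functional>
#include <memory>
#include <mutex>
#include <thread>
#include <vector>

// Minimal counting semaphore built from a mutex and a condition variable.
class Semaphore {
public:
    void release(int n = 1) {
        { std::lock_guard<std::mutex> lock(m_); count_ += n; }
        cv_.notify_all();
    }
    void acquire() {
        std::unique_lock<std::mutex> lock(m_);
        cv_.wait(lock, [this] { return count_ > 0; });
        --count_;
    }
private:
    std::mutex m_;
    std::condition_variable cv_;
    int count_ = 0;
};

struct Shared {
    std::vector<std::unique_ptr<Semaphore>> start;  // one "go" semaphore per worker
    Semaphore finished;                             // workers signal main here
    bool quit = false;       // written by main only while all workers are parked
    int jobs_done = 0;       // guarded by jobs_mutex
    std::mutex jobs_mutex;
};

void worker(Shared& s, int id) {
    for (;;) {
        s.start[id]->acquire();   // sleep until main dispatches this worker
        if (s.quit) return;
        float x = 1.0f;
        for (int i = 1; i < 100; ++i)
            x = std::sqrt(static_cast<float>(i) * x);
        { std::lock_guard<std::mutex> lock(s.jobs_mutex); ++s.jobs_done; }
        s.finished.release();     // tell main this worker is done
    }
}

// Dispatches `workers` threads for `rounds` rounds; returns the number of jobs run.
int run_with_semaphores(int workers, int rounds) {
    Shared s;
    for (int i = 0; i < workers; ++i)
        s.start.push_back(std::unique_ptr<Semaphore>(new Semaphore));

    std::vector<std::thread> pool;
    for (int i = 0; i < workers; ++i)
        pool.emplace_back(worker, std::ref(s), i);

    for (int r = 0; r < rounds; ++r) {
        for (int i = 0; i < workers; ++i) s.start[i]->release();  // wake each worker
        for (int i = 0; i < workers; ++i) s.finished.acquire();   // wait for all of them
    }

    s.quit = true;   // every worker is blocked in acquire() at this point
    for (int i = 0; i < workers; ++i) s.start[i]->release();
    for (auto& t : pool) t.join();
    return s.jobs_done;
}
```

Unlike a ResumeThread call, a release on a worker's semaphore is never lost: if the worker has not reached its `acquire()` yet, the count simply stays at one until it does.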