Intel TBB: using task_group for bitonic sort
I'm implementing a bitonic sort using Intel TBB. With parallel_invoke everything works. With task_group but without calling wait(), the output is not sorted. With task_group used as below (g is a pointer to a task_group), the program does not terminate.
    void bitonic_merge(bool up, int array[], int size) {
        if (size == 1) {
            return;
        }
        int m = greatestPowerOfTwoLessThan(size);
        bitonic_compare(up, array, size - m, m);
        g->run(Bitonic_Merge(up, array, m));
        g->run(Bitonic_Merge(up, &array[m], size - m));
        g->wait();
        return;
    }
Can someone see what is wrong? What is the difference between parallel_invoke and task_group, and which is better in this situation? Or should I use some other method?
Thanks in advance.
Comments (3)
The program doesn't terminate because it is deadlocked. Your code is very close to correct, but 'g' is a global pointer to a task_group and you're doing a recursive task decomposition, and those two don't mix well.
If you break into the debugger, I expect you'll see lots of threads in task_group::wait, waiting for tasks to complete.
The tasks aren't completing because you're sharing one task_group among all the threads and tasks, so they are all effectively waiting for each other.
To fix this, declare a task_group (or structured_task_group) on the stack inside the bitonic_merge function. Tasks can still be scheduled and executed during the call to wait, just as with parallel_invoke, but because the task_group isn't shared among tasks, wait returns once that call's own child tasks have completed, which avoids the deadlock.
Note that I answered a similar question, with a performance slant, on the MSDN forums for the PPL. Remember that the syntax and semantics of task_group, structured_task_group, parallel_invoke and parallel_for / parallel_for_each are consistent between the PPL and TBB; use what makes sense for you or your platform.
Use tbb::parallel_invoke if the number of subproblems is constant.
Otherwise, use recursion and task_group.
Since the number of subproblems here is 2, parallel_invoke is suitable and easier to implement.
Refer to the Intel TBB design patterns for more details.
Waiting on the task group is important here. Without wait(), the function returns before the recursive "calls" made with task_group::run() complete, which breaks the algorithm.
parallel_invoke is indeed applicable, and it automatically waits for the "invoked" functions to complete, so it is easier to use.
What worries me (as a TBB developer) is why the given program snippet does not terminate. It should work well as far as I can tell. Would you mind submitting the full source of the program, either here or at the TBB forum?