Why is my OpenMP implementation slower than a single threaded implementation? (follow-up)
This is a follow-up to "Why is my OpenMP implementation slower than a single threaded implementation?".
I followed the answer given there and used tasking instead of for pragmas to speed up the code. However, the reworked program runs exactly as fast as the equivalent sequential program. I see no speedup.
The reworked code is here: http://pastebin.com/3SFaNEc4
I simply removed all the for pragmas and replaced them with task pragmas for the recursive procedures.
Am I doing anything wrong? I expected to see an almost linear speedup. What do you guys think?
Thanks!
Comments (1)
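First - you still have a "#pragma end critical" which should be removed. It isn't causing a problem, but it is incorrect. Second - as I said in the other question you posted, you might have to think about how you are parallelizing the code to see a speedup, so just replacing the other pragmas with task pragmas may not make it faster. Third - you haven't put the tasks into a parallel region, so you are not running in parallel at all: outside a parallel region only one thread exists, and it ends up executing every task itself. And you can't just add a parallel region around the tasks, because every thread in the team executes the code inside the region, so you would be doing the same tasks multiple times. The usual arrangement is to have a single thread create the tasks inside the parallel region and let the rest of the team pick them up.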
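To illustrate the third point, here is a minimal sketch of the common parallel/single/task pattern. It does not use the code from the pastebin link: `fib_task`, `fib_parallel`, and the serial cutoff of 20 are hypothetical stand-ins for your recursive procedure, chosen only to keep the example short.

```c
/* Hypothetical recursive workload standing in for the asker's procedure. */
static long fib_task(int n)
{
    if (n < 2)
        return n;

    /* Assumed cutoff: small subproblems recurse serially so that
       task-creation overhead does not swamp the useful work. */
    if (n < 20)
        return fib_task(n - 1) + fib_task(n - 2);

    long a, b;

    /* a and b must be shared, otherwise each task writes to its own
       firstprivate copy and the results are lost. */
    #pragma omp task shared(a)
    a = fib_task(n - 1);

    #pragma omp task shared(b)
    b = fib_task(n - 2);

    /* Wait for both child tasks before combining their results. */
    #pragma omp taskwait
    return a + b;
}

long fib_parallel(int n)
{
    long result = 0;

    /* The parallel region creates the team of threads. */
    #pragma omp parallel
    {
        /* Only one thread starts the recursion; without "single",
           every thread in the team would run the whole computation. */
        #pragma omp single
        result = fib_task(n);
    }
    return result;
}
```

With GCC or Clang this needs the `-fopenmp` flag; without it the pragmas are ignored and the code simply runs sequentially, which is one way to end up with identical timings. Whether you get close to linear speedup still depends on how much work each task does relative to the overhead of creating it, which is the second point above.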