How to profile OpenMP bottlenecks

Posted on 2024-12-01 03:46:33

I have a loop that has been parallelized with OpenMP, but due to the nature of the task, it contains 4 critical sections.

What would be the best way to profile the speedup and find out which of the critical sections (or maybe the non-critical parts(!)) take up the most time inside the loop?

I use Ubuntu 10.04 with g++ 4.4.3.


Comments (4)

夏有森光若流苏 2024-12-08 03:46:33

Scalasca is a nice tool for profiling OpenMP (and MPI) codes and analyzing the results. TAU is also very good but much harder to use. The Intel tools, such as VTune, are also good but very expensive.

三五鸿雁 2024-12-08 03:46:33

Arm MAP provides OpenMP and pthreads profiling, and works without needing to instrument or modify your source code. You can see synchronization issues and where threads are spending time, down to the source-line level. The OpenMP profiling blog entry is worth reading.

MAP is widely used in high performance computing, as it also profiles multiprocess applications such as MPI programs.

人疚 2024-12-08 03:46:33

OpenMP includes the functions omp_get_wtime() and omp_get_wtick() for measuring timing performance (docs here). I would recommend using these.
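As a rough illustration of the omp_get_wtime() approach, the sketch below accumulates the time spent in each part of a parallel loop. Everything in it is hypothetical: the names `run_timed_loop`, `LoopTimings`, `section_a` and `section_b`, the placeholder work, and the use of two critical sections instead of the four in the question. The `wall_time()` wrapper and its non-OpenMP fallback are also assumptions, added only so the code still builds without `-fopenmp`.

```cpp
// Sketch: accumulate time per critical section with omp_get_wtime().
// Build with OpenMP enabled, e.g.: g++ -fopenmp timing.cpp
#ifdef _OPENMP
#include <omp.h>
static double wall_time() { return omp_get_wtime(); }
#else
#include <chrono>
// Fallback (assumption) so the sketch still builds, serially, without OpenMP.
static double wall_time() {
    using namespace std::chrono;
    return duration<double>(steady_clock::now().time_since_epoch()).count();
}
#endif

// Hypothetical result type: seconds summed over all threads, plus the
// values computed by the placeholder loop body.
struct LoopTimings {
    double non_critical;  // work outside any critical section
    double critical_a;    // waiting for + executing critical(section_a)
    double critical_b;    // waiting for + executing critical(section_b)
    long sum_a;
    long sum_b;
};

LoopTimings run_timed_loop(int n) {
    double t_non = 0.0, t_a = 0.0, t_b = 0.0;
    long sum_a = 0, sum_b = 0;  // shared; only updated inside critical sections

    #pragma omp parallel for reduction(+:t_non,t_a,t_b)
    for (int i = 0; i < n; ++i) {
        double t0 = wall_time();
        double x = 0.0;  // placeholder non-critical work
        for (int k = 0; k < 100; ++k) x += 0.5 * k;
        long contribution = (x > 0.0) ? i : 0;  // equals i; keeps x in use
        double t1 = wall_time();

        #pragma omp critical(section_a)
        { sum_a += contribution; }
        double t2 = wall_time();

        #pragma omp critical(section_b)
        { sum_b += 2 * contribution; }
        double t3 = wall_time();

        // Each interval includes the time spent waiting to enter the section.
        t_non += t1 - t0;
        t_a += t2 - t1;
        t_b += t3 - t2;
    }

    LoopTimings result = {t_non, t_a, t_b, sum_a, sum_b};
    return result;
}
```

Calling `run_timed_loop(1000)` and comparing the three timing fields gives a first impression of which part dominates. omp_get_wtick() returns the timer resolution, so intervals close to that value are not meaningful, and the timer calls themselves add overhead to very short iterations.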

Otherwise, try a profiler. I prefer the Google CPU profiler, which can be found here.

There is also the manual way described in this answer.

肩上的翅膀 2024-12-08 03:46:33

There is also the ompP tool, which I have used a number of times over the last ten years. I have found it really useful for identifying and quantifying load imbalance and parallel/serial regions. The web page seems to be down now, but I found it on the web archive earlier this year.

Edit: updated home directory
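If no profiler is at hand, load imbalance can be roughly estimated by hand. The sketch below is not how ompP works; it only records how long each thread is busy in one work-sharing loop. The names `measure_imbalance` and `Imbalance`, the helper wrappers, and the deliberately uneven placeholder work are all hypothetical, and the non-OpenMP fallback exists only so the code still builds without `-fopenmp`.

```cpp
// Sketch: per-thread busy time for one work-sharing loop.
// Build with OpenMP enabled, e.g.: g++ -fopenmp imbalance.cpp
#include <cstddef>
#include <vector>
#ifdef _OPENMP
#include <omp.h>
static double wall_time() { return omp_get_wtime(); }
static int thread_id() { return omp_get_thread_num(); }
static int team_size() { return omp_get_num_threads(); }
#else
#include <chrono>
// Fallback (assumption) so the sketch still builds, serially, without OpenMP.
static double wall_time() {
    using namespace std::chrono;
    return duration<double>(steady_clock::now().time_since_epoch()).count();
}
static int thread_id() { return 0; }
static int team_size() { return 1; }
#endif

// Hypothetical result type for this illustration.
struct Imbalance {
    int threads;       // threads in the team
    double max_busy;   // seconds, slowest thread
    double mean_busy;  // seconds, average over threads
    double checksum;   // result of the placeholder work
};

Imbalance measure_imbalance(int n) {
    std::vector<double> busy;
    double checksum = 0.0;

    #pragma omp parallel
    {
        #pragma omp single
        busy.assign(team_size(), 0.0);  // implicit barrier after single

        int tid = thread_id();
        double start = wall_time();

        // nowait: each thread stops its clock as soon as its own share is done.
        #pragma omp for schedule(static) reduction(+:checksum) nowait
        for (int i = 0; i < n; ++i) {
            double local = 0.0;
            for (int k = 0; k < i; ++k) local += 1.0;  // cost grows with i
            checksum += local;
        }
        busy[tid] = wall_time() - start;
    }

    double max_busy = 0.0, total = 0.0;
    for (std::size_t t = 0; t < busy.size(); ++t) {
        if (busy[t] > max_busy) max_busy = busy[t];
        total += busy[t];
    }
    Imbalance r = {(int)busy.size(), max_busy, total / busy.size(), checksum};
    return r;
}
```

A `max_busy` well above `mean_busy` suggests that some threads sit idle at the end of the loop. With the static schedule and growing per-iteration cost used here, the thread handling the last chunk would be expected to take longest.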
