Small OpenMP program sometimes freezes (gcc, c, linux)

Posted on 2024-10-08 07:50:56

I just wrote a small OpenMP test, and it does not always work correctly:

#include <omp.h>
int main() {
  int i,j=0;
#pragma omp parallel
  for(i=0;i<1000;i++)
  {
#pragma omp barrier
    j+= j^i;
  }
  return j;
}

Writing to j from all threads is incorrect in this example, BUT

  • the only consequence should be a nondeterministic value of j

  • instead, I get a freeze.

Compiled with gcc-4.3.1 -fopenmp a.c -o gcc -static

Run on a 4-core x86_Core2 Linux server with $ ./gcc, it sometimes freezes (roughly 1 freeze per 4-5 quick runs).

Strace:

[pid 13118] futex(0x80d3014, FUTEX_WAKE, 1) = 1
[pid 13119] <... futex resumed> )       = 0
[pid 13118] futex(0x80d3020, FUTEX_WAIT, 251, NULL <unfinished ...>
[pid 13119] futex(0x80d3014, FUTEX_WAKE, 1) = 0
[pid 13119] futex(0x80d3020, FUTEX_WAIT, 251, NULL                       
                        <freeze>

Why do I have a freeze (deadlock)?

Comments (3)

你げ笑在眉眼 2024-10-15 07:50:56

Try making i private so each thread has its own copy.

Now that I have more time, I will try to explain. By default, variables in OpenMP are shared. There are a couple of cases where the defaults make variables private, but a parallel region is not one of them (so High Performance Mark's response is wrong). In your original program you have two race conditions: one on i and one on j. The problem is the one on i. Each thread executes the loop some number of times, but since every thread changes i, the number of times any given thread executes the loop is indeterminate. All threads have to reach the barrier for it to be satisfied, so you can end up hanging on a barrier that never completes, because not all threads execute it the same number of times.

Since the OpenMP spec clearly states (OMP spec V3.0, section 2.8.3 barrier Construct) that "the sequence of worksharing regions and barrier regions encountered must be the same for every thread in a team", your program is non-compliant and as such can have indeterminate behavior.
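As a minimal sketch of the suggested fix (not code from the original post), the loop counter can be listed in a private clause so that every thread runs the full iteration count and meets every barrier. The function name run_barrier_loop and the count variable are hypothetical, added only so the result can be checked; the racy update of j is left out on purpose.

/* Hypothetical helper, not from the original post: runs the barrier loop
   with a private loop counter and returns how many iterations the master
   thread completed. */
int run_barrier_loop(int n)
{
    int i;
    int count = 0;   /* written by the master thread only, so no race */

    /* private(i) gives every thread its own counter, so each thread
       executes exactly n barriers and the sequence of barrier regions
       is the same across the team. */
#pragma omp parallel private(i)
    for (i = 0; i < n; i++)
    {
#pragma omp barrier
#pragma omp master
        count++;
    }
    return count;
}

Calling run_barrier_loop(1000) from main and compiling with -fopenmp should return 1000 without hanging, because no thread can leave the loop early while the others still wait on a barrier.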

薆情海 2024-10-15 07:50:56

You're trying to add to the same location from multiple threads. You can't do what you're trying to do in parallel. If you want to do a sum in parallel, you need to divide it into smaller pieces and collect them afterwards.

Update by a5b: right idea, but the wrong part of the code was spotted. The i variable is changed by both threads.
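The "divide it into smaller pieces and collect them afterwards" approach is what OpenMP's reduction clause provides. The sketch below assumes a plain sum of 0..n-1, which is a simpler computation than the j += j^i update in the question; the function name parallel_sum is hypothetical.

/* Hypothetical example: sums the integers 0..n-1 in parallel. */
long parallel_sum(int n)
{
    long sum = 0;
    int i;

    /* Each thread accumulates into its own private copy of sum; OpenMP
       adds the partial sums together when the loop finishes. The loop
       variable of a parallel for is private automatically. */
#pragma omp parallel for reduction(+:sum)
    for (i = 0; i < n; i++)
        sum += i;

    return sum;
}

With this form the result does not depend on how iterations are distributed among threads, so parallel_sum(1000) gives 499500 for any team size.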

遇见了你 2024-10-15 07:50:56

@ejd, if I mark i as private, will my program be compliant?

Sorry - I just saw this question. Technically if you mark variable "i" as private your program will be OpenMP compliant. HOWEVER, there is still a race condition on "j" and while your program is compliant (because there are valid cases to have race conditions), the value of "j" is unspecified (according to the OpenMP spec).

In one of your previous answers you said that you were trying to measure the speed of the barrier implementation. There are several "benchmarks" that you might want to look at that have published results for a variety of OpenMP constructs. One was written by Mark Bull (EPCC, University of Edinburgh), another (Sphinx) comes from Lawrence Livermore National Labs (LLNL), and the third (Parkbench) comes from a Japanese Computing Partnership. They may offer you some guidance.
