How does OpenMP communicate private variables between threads?
I'm writing some code in C++ using OpenMP to parallelize some chunks. I've run into some strange behavior that I can't quite explain. I've rewritten my code so that it reproduces the issue minimally.
First, here is a function I wrote that is to be run in a parallel region.
#include <cstdio>
#include <omp.h>

void foo()
{
    #pragma omp for
    for (int i = 0; i < 3; i++)
    {
        #pragma omp critical
        printf("Hello %d from thread %d.\n", i, omp_get_thread_num());
    }
}
Then here is my whole program.
int main()
{
    omp_set_num_threads(4);
    #pragma omp parallel
    {
        for (int i = 0; i < 2; i++)
        {
            foo();
            #pragma omp critical
            printf("%d\n", i);
        }
    }
    return 0;
}
When I compile and run this code (with g++ -std=c++17), I get the following output on the terminal:
Hello 0 from thread 0.
Hello 1 from thread 1.
Hello 2 from thread 2.
0
0
Hello 2 from thread 2.
Hello 1 from thread 1.
0
Hello 0 from thread 0.
0
1
1
1
1
i is a private variable. I would expect that the function foo would be run twice per thread. So I would expect to see eight "Hello %d from thread %d.\n" statements in the terminal, just like how I see eight numbers printed when printing i. So what gives here? Why is it that in the same loop, OMP behaves so differently?
Comments (2)
From the documentation of omp parallel: [...] Emphasis mine. Since the omp for in foo is a work-sharing construct, it is only executed once per outer iteration, no matter how many threads run the parallel block in main.
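To make the work-sharing point concrete, here is a minimal sketch that keeps the question's shape (an outer loop of 2 run by every thread, an inner omp for loop of 3) but records how often each inner iteration actually runs. The helper name count_executions and the slot layout are made up for illustration and are not part of the original code. It needs to be compiled with OpenMP enabled (for example g++ -fopenmp) for the directives to take effect.

```cpp
#include <array>

// Hypothetical helper: reproduces the loop structure from the question and
// counts how many times each (outer, inner) iteration of the omp for loop
// is executed by the whole team.
std::array<int, 6> count_executions()
{
    int raw[6] = {0};  // raw[outer * 3 + inner], shared by all threads

    #pragma omp parallel num_threads(4)
    {
        // Every thread runs this outer loop in full; `outer` is private.
        for (int outer = 0; outer < 2; outer++)
        {
            // The team divides these three iterations among its threads,
            // so each iteration is handed to exactly one thread.
            #pragma omp for
            for (int inner = 0; inner < 3; inner++)
            {
                #pragma omp atomic
                raw[outer * 3 + inner]++;
            }
        }
    }

    // Copy the counters into a value that can be returned.
    std::array<int, 6> hits{};
    for (int k = 0; k < 6; k++)
    {
        hits[k] = raw[k];
    }
    return hits;
}
```

Calling count_executions() from main and printing the six entries should show a 1 in every slot: each of the 2*3 iterations runs once for the team as a whole, not once per thread.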

It is because #pragma omp for is a worksharing construct, so it distributes the work among the threads of the team. The number of threads does not matter in this respect, only the total number of loop iterations (2*3 = 6). If you use omp_set_num_threads(1); you also see 6 outputs. If you use more threads than loop iterations, some threads will be idle in the inner loop, but you still see exactly 6 outputs. On the other hand, if you remove the #pragma omp for line, you will see (number of threads)*2*3 (= 24) outputs.
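The difference between the two cases can be checked side by side. The following is only a sketch: the Counts struct and the compare_counts function are hypothetical names introduced here. It counts loop-body executions instead of printing, and it measures the team size by letting each thread check in once, because num_threads is a request and the runtime may provide fewer threads.

```cpp
// Hypothetical result type: how many threads actually ran, and how many
// times each variant of the inner loop body was executed in total.
struct Counts
{
    int team_size;
    int shared;      // inner loop with #pragma omp for
    int replicated;  // inner loop without any directive
};

// Hypothetical helper: runs both variants inside one parallel region.
Counts compare_counts(int threads)
{
    (void)threads;  // silences "unused" warnings when OpenMP is disabled
    int team_size = 0, shared = 0, replicated = 0;

    #pragma omp parallel num_threads(threads)
    {
        // Each thread of the team checks in exactly once.
        #pragma omp atomic
        team_size++;

        for (int outer = 0; outer < 2; outer++)
        {
            // Work-shared: the team splits the 3 iterations.
            #pragma omp for
            for (int inner = 0; inner < 3; inner++)
            {
                #pragma omp atomic
                shared++;
            }

            // No directive: every thread runs all 3 iterations itself.
            for (int inner = 0; inner < 3; inner++)
            {
                #pragma omp atomic
                replicated++;
            }
        }
    }
    return Counts{team_size, shared, replicated};
}
```

For compare_counts(4), shared comes out as 6 and replicated as team_size * 6, which is 24 when the runtime grants the full team of four threads.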