Why does "#pragma omp parallel { #pragma omp parallel for }" differ from "#pragma omp parallel" in the number of executions?
Known: number of processors: 28

Code 1:

    void fun1() { printf("Hello, world\n"); }

    #pragma omp parallel
    {
        fun1();
    }

Code 2:

    void fun2()
    {
        #pragma omp for
        for (int i = 0; i < 10; i++) {
            printf("Hello, world\n");
        }
    }

    #pragma omp parallel
    {
        fun2();
    }

Code 3:

    #pragma omp parallel
    {
        #pragma omp for
        for (int i = 0; i < 10; i++) {
            printf("Hello, world\n");
        }
    }

Results:

Code 1: printf is executed 28*1 = 28 times.

Code 2 is equivalent to Code 3: printf is executed 10 times. Why? Why is printf not executed 28*10 = 280 times, with each of the 28 threads responsible for the whole for-loop?
ORIGINAL POST:

Question:

Why does

    #pragma omp parallel
    {
        #pragma omp for
        for (int i = 0; i < N; i++) {}
    }

execute every iteration of the loop exactly once? Why isn't

    #pragma omp for
    for (int i = 0; i < N; i++) {}

(i.e. the code within { } above) executed as many times as there are threads (denoted M), as the specification of "#pragma omp parallel" would suggest, so that every iteration of the loop is executed M times, once by each of the M threads?

Or can this kind of nested construct with "for" not be explained by the specification of "#pragma omp parallel" alone, because it depends on the implementation?
The two basic concepts in OpenMP are 1. the parallel region: when a thread encounters omp parallel, a team of threads is created, and each thread starts executing the region; and 2. worksharing constructs, of which omp for is the most obvious one: if you have a team of threads, the work is distributed over those threads. So in both your Code 2 and Code 3 you create a team, and then the team encounters the loop and distributes the iterations.

You are wondering why not every thread executes the whole loop? That would happen if you omitted the omp for. In that case the loop is a statement like any other, and each thread executes it in its entirety.
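To make the difference concrete, here is a minimal sketch that counts loop-body executions with and without the worksharing construct. The function names count_with_worksharing and count_without_worksharing are hypothetical, chosen only for this illustration, and the team size is whatever the runtime picks (28 on the asker's machine, possibly different elsewhere).

```c
#include <stdio.h>

/* Loop inside a parallel region WITH "omp for": the team divides the
   iterations among its threads, so the body runs n times in total. */
int count_with_worksharing(int n)
{
    int count = 0;                  /* shared by all threads in the team */
    #pragma omp parallel
    {
        #pragma omp for
        for (int i = 0; i < n; i++) {
            #pragma omp atomic
            count++;
        }
    }
    return count;
}

/* Same loop WITHOUT "omp for": it is an ordinary statement inside the
   region, so every thread runs all n iterations. The team size is
   reported through *team_size. */
int count_without_worksharing(int n, int *team_size)
{
    int count = 0;
    int threads = 0;
    #pragma omp parallel
    {
        #pragma omp atomic
        threads++;                  /* each thread checks in exactly once */

        for (int i = 0; i < n; i++) {
            #pragma omp atomic
            count++;
        }
    }
    *team_size = threads;
    return count;
}
```

With GCC this needs the -fopenmp flag; without it the pragmas are ignored and both functions run on a single thread. Calling count_with_worksharing(10) should give 10 regardless of the team size, while count_without_worksharing(10, &m) should give 10 * m.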
This code (the bare "#pragma omp for" loop from the question, with no enclosing parallel region) is practically sequential code. As per the section Worksharing-Loop Construct in the OpenMP specification, the for construct needs a parallel construct that it binds to. The parallel construct creates the threads that the for construct uses to execute in parallel. So you indeed have to write the loop inside a "#pragma omp parallel" region. You can use the shorter combined form, "#pragma omp parallel for", too.

UPDATE (to reflect the update to the original post):

Code 1 in the original post runs 28 threads in the parallel region, each calling the function and printing "Hello, world".

Code 2 and Code 3 spawn 28 threads. Code 2 calls the function, and the for construct distributes the 10 loop iterations across the 28 threads. Since there are only 10 iterations, only 10 invocations of printf will happen, and at most 10 threads will actually print. The remaining threads (at least 18) get no iterations and do nothing. The same holds for Code 3. The link I have provided explains what the for construct does.
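The two points above can be sketched as follows. This is an illustration only: orphaned_for_count, distinct_workers and MAX_ITERS are hypothetical names, and the fallback definition of omp_get_thread_num exists just so the sketch still builds when OpenMP is disabled.

```c
#include <stdio.h>

#ifdef _OPENMP
#include <omp.h>
#else
/* Fallback (an assumption for this sketch) when built without OpenMP. */
static int omp_get_thread_num(void) { return 0; }
#endif

#define MAX_ITERS 64

/* "omp for" with no enclosing parallel region: there is only the one
   encountering thread to share the work with, so the loop runs
   sequentially and the body executes n times. */
int orphaned_for_count(int n)
{
    int count = 0;
    #pragma omp for
    for (int i = 0; i < n; i++)
        count++;                    /* no atomic needed: a single thread */
    return count;
}

/* Combined "parallel for": records which thread ran each iteration and
   returns how many distinct threads received at least one iteration.
   This can never exceed n, however large the team is. */
int distinct_workers(int n)
{
    int owner[MAX_ITERS];
    if (n > MAX_ITERS)
        n = MAX_ITERS;

    #pragma omp parallel for
    for (int i = 0; i < n; i++)
        owner[i] = omp_get_thread_num();

    int distinct = 0;
    for (int i = 0; i < n; i++) {
        int seen = 0;
        for (int j = 0; j < i; j++) {
            if (owner[j] == owner[i]) {
                seen = 1;
                break;
            }
        }
        if (!seen)
            distinct++;
    }
    return distinct;
}
```

On a 28-thread team, distinct_workers(10) would be expected to return at most 10, which matches the observation that the remaining threads stay idle. The exact assignment of iterations to threads depends on the schedule the implementation chooses.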