Why is "#pragma omp parallel { #pragma omp for }" different from "#pragma omp parallel" in execution?

Posted 2025-02-05 05:18:25 · 1270 characters · 2 views · 0 comments

Based on:enter link description here

Known: number of processors: 28

Code 1:

#include <stdio.h>

void fun1()
{
    printf("Hello, world\n");
}

int main()
{
    #pragma omp parallel
    {
        fun1();
    }
    return 0;
}

Code 2:

#include <stdio.h>

void fun2()
{
    #pragma omp for
    for(int i=0;i<10;i++)
    {
        printf("Hello, world\n");
    }
}

int main()
{
    #pragma omp parallel
    {
        fun2();
    }
    return 0;
}

Code 3:

#include <stdio.h>

int main()
{
    #pragma omp parallel
    {
        #pragma omp for
        for(int i=0;i<10;i++)
        {
            printf("Hello, world\n");
        }
    }
    return 0;
}

Results:

Code 1: printf is executed 28*1 = 28 times.

Code 2 is equivalent to Code 3: printf is executed 10 times. Why? Why isn't printf executed 28*10 = 280 times, with each of the 28 threads responsible for the whole for-loop?



ORIGINAL POST:

Question:

Why

#pragma omp parallel
{
    #pragma omp for
    for(int i=0;i<N;i++){}
}

results in every iteration of the loop being executed exactly once, and why isn't

#pragma omp for
for(int i=0;i<N;i++){}

(i.e. the code within { } above) executed as many times as the number of threads (denoted M)? According to the specification of "#pragma omp parallel", shouldn't every iteration of the loop be executed M times, once by each of the M threads?

Or is it that this kind of nesting of the for construct can't be natively explained by the specification of "#pragma omp parallel" because of implementation details?


Comments (2)

沉溺在你眼里的海 2025-02-12 05:18:25


The two basic concepts in OpenMP are: 1. the parallel region: when execution encounters omp parallel, a team of threads is created, and each thread starts executing the region; and 2. "worksharing constructs", of which omp for is the most obvious one: if you have a team of threads, the work is distributed over those threads. So in both your Code 2 and Code 3 you create a team, and then the team encounters the loop and distributes the iterations.

You are wondering why every thread does not execute the whole loop? That would happen if you omitted the omp for. In that case the loop is a statement like any other, and each thread executes it in its entirety.

香草可樂 2025-02-12 05:18:25


This code:

#pragma omp for
for(int i=0;i<N;i++){}

is effectively sequential code. As per the section "Worksharing-Loop Construct" in the OpenMP specification, the for construct needs a parallel construct to bind to. The parallel construct creates the threads that the for construct uses to execute in parallel. So you indeed have to write

#pragma omp parallel  // creates the threads
{
    #pragma omp for   // execute in parallel
    for(int i=0;i<N;i++){}
}

You can use the shorter form, too:

#pragma omp parallel for   // create threads & execute in parallel
for(int i=0;i<N;i++){}

UPDATE (to reflect the update to the original post):

Code 1 in the original post runs 28 threads in the parallel region, each calling the function and printing "Hello, world".

Code 2 and Code 3 also spawn 28 threads. In Code 2 each thread calls the function, and the for construct distributes the 10 loop iterations across the 28 threads. Since there are only 10 iterations, only 10 invocations of printf happen, and at most 10 threads actively print; the other 18 do nothing. The same holds for Code 3.

The link I have provided explains what the for construct does.
