使用 OpenMP 并行化递归的基本情况计算

发布于 2024-10-17 18:35:05 字数 772 浏览 3 评论 0原文

我正在尝试学习 OpenMP 的概念，并偶然发现了一个案例，我很难掌握如何使用该库来解决该问题。

假设我们有以下递归函数，

// ...
void recurse(int tmp[], int p, const int size)
{
   if (p == size)
   {
      // Computationally heavy, should be executed in its own "thread"
      performTask(tmp); // Note: Only requires read access
   }
   else
   {
      for(int i = 0; i < size; i++)
      {
         // Alter tmp and continue recursion
         tmp[p] = i;
         recurse(tmp, p+1, size);
      }
   }
}
// ...
int main(int argc, char * argv[])
{
    int tmp[10];
    recurse(tmp, 0, 10);
    return 0;
}

如何在使用 OpenMP 在主线程中生成新结构的同时并行执行performTask？

我知道有一种叫做“任务”的东西，我认为这就是我应该在这里使用的东西，但我想出的所有东西根本没有获得任何性能提升。请指出我正确的方向。

编辑：我使示例程序更加具体，以便更好地解释。

原文

I'm trying to learn the concepts OpenMP and stumbled upon a case which I'm having a hard time grasping on how to solve using this library.

Let's say we have the following recursion function

// ...
void recurse(int tmp[], int p, const int size)
{
   if (p == size)
   {
      // Computationally heavy, should be executed in its own "thread"
      performTask(tmp); // Note: Only requires read access
   }
   else
   {
      for(int i = 0; i < size; i++)
      {
         // Alter tmp and continue recursion
         tmp[p] = i;
         recurse(tmp, p+1, size);
      }
   }
}
// ...
int main(int argc, char * argv[])
{
    int tmp[10];
    recurse(tmp, 0, 10);
    return 0;
}

How can I execute performTask in parallel while generating new structs in the master thread using OpenMP?

I know there is something called 'tasks', and I think that's what I'm supposed to be using here, but everything I come up with just doesn't get any performance gains at all. Please point me in the right direction.

Edit: I made the example program more concrete for better explanation.

分享到QQ

分享到微博

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

需要登录才能够评论，你可以免费注册一个本站的账号。

淑女气质 2024-10-24 18:35:05

下面的代码不能按原样工作，但希望它能为您指明正确的方向：

// ...
void recurse(int tmp[], int p, const int size)
{
   if (p == size)
   {
      // Computationally heavy, should be executed in its own "thread"
      // perform task using the thread pool
#pragma omp task     
      performTask(tmp); // Note: Only requires read access
   }
   else
   {
      for(int i = 0; i < size; i++)
      {
         // Alter tmp and continue recursion
         tmp[p] = i;
         recurse(tmp, p+1, size);
      }
   }
}
// ...
int main(int argc, char * argv[])
{    
    int tmp[10];
    // start threads
#pragma omp parallel
{
    // use single thread to construct `tmp` values
#pragma omp single nowait
    recurse(tmp, 0, 10);
}
    return 0;
}

该代码基于比较 OpenMP 3.0 中的嵌套并行区域和任务分配。

The code below doesn't work as is, but hopefully it will point you in the right direction:

// ...
void recurse(int tmp[], int p, const int size)
{
   if (p == size)
   {
      // Computationally heavy, should be executed in its own "thread"
      // perform task using the thread pool
#pragma omp task     
      performTask(tmp); // Note: Only requires read access
   }
   else
   {
      for(int i = 0; i < size; i++)
      {
         // Alter tmp and continue recursion
         tmp[p] = i;
         recurse(tmp, p+1, size);
      }
   }
}
// ...
int main(int argc, char * argv[])
{    
    int tmp[10];
    // start threads
#pragma omp parallel
{
    // use single thread to construct `tmp` values
#pragma omp single nowait
    recurse(tmp, 0, 10);
}
    return 0;
}

The code is based on Comparing Nested Parallel Regions and Tasking in OpenMP 3.0.

回复收藏 0 原文

~没有更多了~