使用 OpenMP 并行化递归的基本情况计算

发布于 2024-10-17 18:35:05 字数 772 浏览 3 评论 0原文

我正在尝试学习 OpenMP 的概念,并偶然发现了一个案例,我很难掌握如何使用该库来解决该问题。

假设我们有以下递归函数,

// ...
void recurse(int tmp[], int p, const int size)
{
   if (p == size)
   {
      // Computationally heavy, should be executed in its own "thread"
      performTask(tmp); // Note: Only requires read access
   }
   else
   {
      for(int i = 0; i < size; i++)
      {
         // Alter tmp and continue recursion
         tmp[p] = i;
         recurse(tmp, p+1, size);
      }
   }
}
// ...
int main(int argc, char * argv[])
{
    int tmp[10];
    recurse(tmp, 0, 10);
    return 0;
}

如何在使用 OpenMP 在主线程中生成新结构的同时并行执行performTask

我知道有一种叫做“任务”的东西,我认为这就是我应该在这里使用的东西,但我想出的所有东西根本没有获得任何性能提升。请指出我正确的方向。

编辑:我使示例程序更加具体,以便更好地解释。

I'm trying to learn the concepts OpenMP and stumbled upon a case which I'm having a hard time grasping on how to solve using this library.

Let's say we have the following recursion function

// ...
void recurse(int tmp[], int p, const int size)
{
   if (p == size)
   {
      // Computationally heavy, should be executed in its own "thread"
      performTask(tmp); // Note: Only requires read access
   }
   else
   {
      for(int i = 0; i < size; i++)
      {
         // Alter tmp and continue recursion
         tmp[p] = i;
         recurse(tmp, p+1, size);
      }
   }
}
// ...
int main(int argc, char * argv[])
{
    int tmp[10];
    recurse(tmp, 0, 10);
    return 0;
}

How can I execute performTask in parallel while generating new structs in the master thread using OpenMP?

I know there is something called 'tasks', and I think that's what I'm supposed to be using here, but everything I come up with just doesn't get any performance gains at all. Please point me in the right direction.

Edit: I made the example program more concrete for better explanation.

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(1

淑女气质 2024-10-24 18:35:05

下面的代码不能按原样工作,但希望它能为您指明正确的方向:

// ...
void recurse(int tmp[], int p, const int size)
{
   if (p == size)
   {
      // Computationally heavy, should be executed in its own "thread"
      // perform task using the thread pool
#pragma omp task     
      performTask(tmp); // Note: Only requires read access
   }
   else
   {
      for(int i = 0; i < size; i++)
      {
         // Alter tmp and continue recursion
         tmp[p] = i;
         recurse(tmp, p+1, size);
      }
   }
}
// ...
int main(int argc, char * argv[])
{    
    int tmp[10];
    // start threads
#pragma omp parallel
{
    // use single thread to construct `tmp` values
#pragma omp single nowait
    recurse(tmp, 0, 10);
}
    return 0;
}

该代码基于 比较 OpenMP 3.0 中的嵌套并行区域和任务分配

The code below doesn't work as is, but hopefully it will point you in the right direction:

// ...
void recurse(int tmp[], int p, const int size)
{
   if (p == size)
   {
      // Computationally heavy, should be executed in its own "thread"
      // perform task using the thread pool
#pragma omp task     
      performTask(tmp); // Note: Only requires read access
   }
   else
   {
      for(int i = 0; i < size; i++)
      {
         // Alter tmp and continue recursion
         tmp[p] = i;
         recurse(tmp, p+1, size);
      }
   }
}
// ...
int main(int argc, char * argv[])
{    
    int tmp[10];
    // start threads
#pragma omp parallel
{
    // use single thread to construct `tmp` values
#pragma omp single nowait
    recurse(tmp, 0, 10);
}
    return 0;
}

The code is based on Comparing Nested Parallel Regions and Tasking in OpenMP 3.0.

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文