C++ Parallelization Libraries: OpenMP vs. Threading Building Blocks

Posted on 2024-07-14 18:21:02

I'm going to retrofit my custom graphics engine so that it takes advantage of multicore CPUs. More exactly, I am looking for a library to parallelize loops.

It seems to me that both OpenMP and Intel's Threading Building Blocks (TBB) are very well suited for the job. Also, both are supported by Visual Studio's C++ compiler and most other popular compilers. And both libraries seem quite straightforward to use.

So, which one should I choose? Has anyone tried both libraries and can give me some cons and pros of using either library? Also, what did you choose to work with in the end?

Thanks,

Adrian

Comments (7)

贩梦商人 2024-07-21 18:21:02

I haven't used TBB extensively, but my impression is that the two complement each other more than they compete. TBB provides thread-safe containers and some parallel algorithms, whereas OpenMP is more of a way to parallelise existing code.

Personally, I've found OpenMP very easy to drop into existing code where you have a parallelisable loop or a bunch of sections that can be run in parallel. However, it doesn't particularly help when you need to modify some shared data, which is where TBB's concurrent containers might be exactly what you want.
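As a rough illustration of the "bunch of sections" case, here is a minimal sketch using `#pragma omp parallel sections`. The function names (`update_physics`, `update_animation`, `run_frame`) are hypothetical and not from the original post; the only assumption is that the two pieces of work write to disjoint data. If OpenMP is not enabled in the compiler, the pragmas are ignored and the two calls simply run one after the other.

```cpp
#include <vector>

// Hypothetical, independent pieces of per-frame work.
// Each one writes only to its own vector, so they can safely run at the same time.
void update_physics(std::vector<float>& positions) {
    for (float& p : positions) p += 1.0f;
}

void update_animation(std::vector<float>& weights) {
    for (float& w : weights) w *= 0.5f;
}

// Runs the two updates as separate OpenMP sections.
// Each section is executed once, by whichever thread picks it up.
void run_frame(std::vector<float>& positions, std::vector<float>& weights) {
    #pragma omp parallel sections
    {
        #pragma omp section
        update_physics(positions);

        #pragma omp section
        update_animation(weights);
    }
}
```

Sections suit a small, fixed number of unrelated jobs; for many similar iterations a parallel loop is usually the better fit.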

If all you want is to parallelise loops where the iterations are independent (or can be fairly easily made so), I'd go for OpenMP. If you're going to need more interaction between the threads, I think TBB may offer a little more in that regard.
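To make the "independent iterations" case concrete, here is a small sketch. The helper names `scale_all` and `sum_all` are made up for illustration. The first loop is independent as written; the second becomes independent once the running total is declared as a reduction.

```cpp
#include <vector>

// Hypothetical helper: scales every element in place.
// Iteration i touches only data[i], so the iterations are independent.
void scale_all(std::vector<float>& data, float factor) {
    // A signed loop index keeps the loop valid for older OpenMP 2.0 compilers.
    const int n = static_cast<int>(data.size());
    #pragma omp parallel for
    for (int i = 0; i < n; ++i) {
        data[i] *= factor;
    }
}

// Hypothetical helper: sums all elements.
// A plain shared accumulator would race; reduction(+:total) gives each thread
// its own partial sum and combines them when the loop ends.
double sum_all(const std::vector<float>& data) {
    const int n = static_cast<int>(data.size());
    double total = 0.0;
    #pragma omp parallel for reduction(+:total)
    for (int i = 0; i < n; ++i) {
        total += data[i];
    }
    return total;
}
```

Both functions give the same result with or without OpenMP enabled for these inputs; in general a floating-point reduction may differ in the last bits because the order of additions changes.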

清晨说晚安 2024-07-21 18:21:02

From Intel's software blog: Compare Windows* threads, OpenMP*, Intel® Threading Building Blocks for parallel programming

It is also a matter of style: for me, TBB is very C++-like, while I don't like OpenMP pragmas that much (they reek of C a bit; I would use them if I had to write in C).

I would also consider the existing knowledge and experience of the team. Learning a new library (especially when it comes to threading/concurrency) does take some time. I think that, for now, OpenMP is more widely known and deployed than TBB (but this is just my opinion).

Yet another factor - but considering most common platforms, probably not an issue - portability. But the license might be an issue.

  • TBB incorporates some nice results originating from academic research, for example the recursive data-parallel approach.
  • There is also some work on cache-friendliness.
  • The Intel blog looks like really interesting reading.

海未深 2024-07-21 18:21:02

In general, I have found that using TBB requires much more time-consuming changes to the code base, with a high payoff, while OpenMP gives a quick but moderate payoff. If you are starting a new module from scratch and thinking long term, go with TBB. If you want small but immediate gains, go with OpenMP.

Also, TBB and OpenMP are not mutually exclusive.

眼眸里的那抹悲凉 2024-07-21 18:21:02

I've actually used both, and my general impression is that if your algorithm is fairly easy to make parallel (e.g. loops of even size, not too much data interdependence) OpenMP is easier, and quite nice to work with. In fact, if you find you can use OpenMP, it's probably the better way to go, if you know your platform will support it. I haven't used OpenMP's new Task structures, which are much more general than the original loop and section options.

TBB gives you more data structures up front, but definitely requires more up front. As a plus, it might be better at making you aware of race condition bugs. What I mean by this is that it is fairly easy in OpenMP to enable race conditions by not making something shared (or whatever) that should be. You only see this when you get bad results. I think this is a bit less likely to occur with TBB.
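One way to make this kind of data-sharing mistake harder in OpenMP is the `default(none)` clause, which forces every variable used inside the construct to be listed explicitly. The sketch below is only an illustration; the function name `sum_of_squares` and its variables are hypothetical.

```cpp
#include <vector>

// Hypothetical example: sum of the squares of the input values.
double sum_of_squares(std::vector<double>& values) {
    int n = static_cast<int>(values.size());
    double total = 0.0;

    // A temporary declared out here, before the loop, would be shared by default,
    // and every thread would overwrite it: a data race that only shows up as bad results.
    // With default(none), and OpenMP enabled, the compiler rejects any variable used
    // in the loop whose sharing is not stated, so a forgotten clause fails at compile time.
    #pragma omp parallel for default(none) shared(values, n) reduction(+:total)
    for (int i = 0; i < n; ++i) {
        // Declared inside the loop body, so each iteration has its own copy.
        double squared = values[i] * values[i];
        total += squared;
    }
    return total;
}
```

Declaring temporaries inside the loop body, as done with `squared` here, avoids most `private` clauses altogether.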

Overall my personal preference was for OpenMP, especially given its increased expressiveness with tasks.
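For readers who have not seen OpenMP tasks, here is a minimal sketch of the kind of irregular work they can express and that loops and sections cannot: a recursive walk over a binary tree. The `Node` type and the function names are hypothetical, and a real implementation would stop creating tasks below some subtree size to limit overhead.

```cpp
#include <memory>

// Hypothetical binary tree node.
struct Node {
    int value;
    std::unique_ptr<Node> left;
    std::unique_ptr<Node> right;
};

// Sums a subtree. Each child subtree is handed to an OpenMP task.
long sum_tree(const Node* node) {
    if (node == nullptr) return 0;
    long left_sum = 0;
    long right_sum = 0;

    // The results must be shared so the tasks write into this call's variables.
    #pragma omp task shared(left_sum)
    left_sum = sum_tree(node->left.get());

    #pragma omp task shared(right_sum)
    right_sum = sum_tree(node->right.get());

    // Wait for both child tasks before reading their results.
    #pragma omp taskwait
    return node->value + left_sum + right_sum;
}

// Entry point: one thread starts the recursion, the rest of the team picks up tasks.
long parallel_sum_tree(const Node* root) {
    long total = 0;
    #pragma omp parallel
    {
        #pragma omp single
        total = sum_tree(root);
    }
    return total;
}
```

Tasks require OpenMP 3.0 or later; without OpenMP enabled the pragmas are ignored and the function is an ordinary recursive sum.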

圈圈圆圆圈圈 2024-07-21 18:21:02

As far as I know, TBB (an open-source version is available under GPLv2) is aimed more at C++ than at C. These days it is hard to find information specific to parallelizing C++ and OOP code in general. Most of it addresses function-oriented code such as C (the same goes for CUDA and OpenCL). If you need C++ support for parallelization, go for TBB!

你げ笑在眉眼 2024-07-21 18:21:02

Yes, TBB is much more C++ friendly, while OpenMP is more appropriate for FORTRAN-style C code given its design. The new task feature in OpenMP looks very interesting, while at the same time the lambdas and function objects in C++0x may make TBB easier to use.

眉目亦如画i 2024-07-21 18:21:02

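In Visual Studio 2008, with OpenMP support enabled (the /openmp compiler option), you can add the following line to parallelize a "for" loop whose iterations are independent. It also works when the loop body contains nested for loops, as long as the inner loop variables are private. Here is an example: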

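// i and j are declared outside the loop, so both are listed as private;
// a shared j would be overwritten by several threads in the inner loop.
#pragma omp parallel for private(i,j)
for (i=0; i<num_particles; i++)
{
  // Assumes fitnessFunction is thread-safe and each iteration touches only p[i].
  p[i].fitness = fitnessFunction(p[i].present);
  if (p[i].fitness > p[i].pbestFitness)
  {
     p[i].pbestFitness = p[i].fitness;
     for (j=0; j<p[i].numVars; j++) p[i].pbest[j] = p[i].present[j];
  }
}
// The global best is computed serially, after the parallel loop has finished.
gbest = pso_get_best(num_particles, p);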

After we added the #pragma omp parallel for, both cores on my Core 2 Duo were used to their maximum capacity, so total CPU usage went from 50% to 100%.
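For anyone who wants to try this outside the original project, here is a self-contained sketch of the same personal-best update. The `Particle` layout, the toy `fitnessFunction`, and the name `update_personal_bests` are assumptions made up for this illustration, not the original code. Declaring the loop index inside the `for` statement makes it private automatically, so no `private` clause is needed.

```cpp
#include <vector>

// Hypothetical particle layout, loosely modelled on the fields used in the snippet above.
struct Particle {
    std::vector<double> present;   // current position
    std::vector<double> pbest;     // best position seen so far
    double fitness = 0.0;
    double pbestFitness = -1e300;  // starts below any fitness the toy function can return
};

// Toy fitness function (higher is better, maximum at the origin).
// It reads only its argument, so it is safe to call from several threads.
double fitnessFunction(const std::vector<double>& x) {
    double s = 0.0;
    for (double v : x) s -= v * v;
    return s;
}

// Updates each particle's fitness and personal best.
// Iteration i reads and writes only p[i], so the iterations are independent.
void update_personal_bests(std::vector<Particle>& p) {
    const int num_particles = static_cast<int>(p.size());
    #pragma omp parallel for
    for (int i = 0; i < num_particles; ++i) {
        p[i].fitness = fitnessFunction(p[i].present);
        if (p[i].fitness > p[i].pbestFitness) {
            p[i].pbestFitness = p[i].fitness;
            p[i].pbest = p[i].present;  // copies the whole position vector
        }
    }
}
```

Picking the global best stays a separate serial step after the loop, as in the original snippet.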
