Executing OpenMP code in a thread pool
I'm thinking about a design where a thread pool executes code blocks that may contain OpenMP statements (mostly parallel for).
(Similar to: How to deal with OpenMP thread pool contention, I guess.)
My question is whether it will cause problems or hurt performance if an OpenMP parallel region is executed by a different thread every time.
Edit:
The targets will be Linux (gcc) and Windows (msvc).
I will benchmark it once my first prototype is done (the prototype will be influenced by the answers I get here).
Here is a simple example:
class Task
{
public:
    void doTask()
    {
        #pragma omp parallel
        {
            // do work in parallel
        }
    }
};
Now imagine you create an instance of Task and give it to a thread pool (thread-0, ..., thread-n). One thread executes doTask(). Later you hand the same Task object to the thread pool again, and again, and so on.
So doTask() (and the parallel region inside it) will be executed by different threads. I wonder whether OpenMP handles this efficiently (e.g. the threads for the region are not recreated every time).
Comments (1)
Vitor's comment is correct. It is hard to tell whether this will cause problems, because the answer depends on many factors (e.g., the data layout, how you access the data, the cache size, the type of processor you are running on, and the list goes on).
What I can say is that you may or may not be able to get this to work. The OpenMP spec, like most other threading models, says nothing about how or whether different models will "play nicely together". For example, even though some OpenMP implementations are built on pthreads, the user cannot call the pthreads library directly and expect it to work together with OpenMP unless the implementation has done some work to support that. A current example is gcc bug 42616 (OMP'ed loop inside pthread leads to crash). Another example is Intel, whose compiler supports many parallel models and which has worked hard to get them to work together. Since you haven't said what compiler you are going to use, all I can say is: try a small sample code to see if it works before you commit to something large.
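A small test along those lines could look like the sketch below. It is only an illustration, not a statement about how any particular runtime behaves: the names sum_with_openmp and count_correct_runs are made up for this example, and std::thread stands in for whatever thread pool you actually use. Each non-OpenMP thread enters a parallel region and the results are checked against the known sum.

```cpp
#include <cstdint>
#include <thread>
#include <vector>

// Hypothetical task body: sums 0..n-1 inside an OpenMP parallel region.
// If OpenMP is not enabled at compile time, the pragma is ignored and the
// loop simply runs serially, so the result is the same either way.
std::int64_t sum_with_openmp(std::int64_t n) {
    std::int64_t total = 0;
    #pragma omp parallel for reduction(+ : total)
    for (std::int64_t i = 0; i < n; ++i) {
        total += i;
    }
    return total;
}

// Runs the task once on each of `workers` separately created threads
// (a stand-in for pool workers) and returns how many of them produced
// the expected sum.
int count_correct_runs(int workers, std::int64_t n) {
    std::vector<std::int64_t> results(workers, -1);
    std::vector<std::thread> pool;
    for (int w = 0; w < workers; ++w) {
        // Each thread writes only its own slot, so no locking is needed.
        pool.emplace_back([&results, w, n] { results[w] = sum_with_openmp(n); });
    }
    for (auto& t : pool) {
        t.join();
    }
    const std::int64_t expected = n * (n - 1) / 2;
    int ok = 0;
    for (std::int64_t r : results) {
        if (r == expected) {
            ++ok;
        }
    }
    return ok;
}
```

Build it with OpenMP enabled (-fopenmp for gcc, /openmp for msvc) and call, say, count_correct_runs(4, 100000) from main. If it crashes or returns fewer than 4, that combination of compiler and threading is not safe for your design.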
I have tried something like this in the past. I used pthreads that in turn used OpenMP constructs, and for my application it worked okay. Each pthread was treated as an initial thread when it encountered an OpenMP parallel region. The OpenMP runtime then created the additional threads for the region and ran it. Since most OpenMP implementations don't destroy these threads but put them in a free pool for reuse when another region is encountered, the overhead seemed fine - but then I had a lot of work to do in each region. So it can work - but you have to be careful.
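Whether the worker threads really get reused in your setup is something you can measure rather than guess. The sketch below is one rough way to do that, under stated assumptions: the function names are invented for this example, the region does almost no work so that the timing is dominated by region entry and exit, and short-lived std::thread objects are used as the "different caller each time" case, which is harsher than a real pool with long-lived workers. How a runtime caches threads per calling thread is implementation-specific, so treat the numbers as a hint only.

```cpp
#include <chrono>
#include <thread>
#include <vector>

// Times one nearly empty parallel region, in microseconds.
double time_one_region() {
    const auto start = std::chrono::steady_clock::now();
    int touched = 0;
    #pragma omp parallel reduction(+ : touched)
    {
        touched += 1;  // trivial work so the region is not empty
    }
    const auto stop = std::chrono::steady_clock::now();
    (void)touched;  // the value itself is not interesting here
    return std::chrono::duration<double, std::micro>(stop - start).count();
}

// Enters `calls` regions back to back from the same calling thread.
// If the runtime keeps its threads around, calls after the first one
// would be expected to be cheaper than the first.
std::vector<double> time_repeated_regions(int calls) {
    std::vector<double> us;
    us.reserve(calls);
    for (int i = 0; i < calls; ++i) {
        us.push_back(time_one_region());
    }
    return us;
}

// Enters one region from each of `callers` distinct threads, one after
// another, so every region has a different encountering thread.
std::vector<double> time_regions_from_new_threads(int callers) {
    std::vector<double> us(callers, 0.0);
    for (int c = 0; c < callers; ++c) {
        std::thread t([&us, c] { us[c] = time_one_region(); });
        t.join();  // run callers sequentially so timings do not overlap
    }
    return us;
}
```

Print both vectors and compare them. If the per-region time from new threads stays close to the first call of the repeated case instead of dropping, the runtime is probably setting up threads again for each new caller, and the regions need enough work to amortize that cost.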