OpenMP: assigning threads one iteration at a time
Say you have a loop with a varying number of iterations and 4 cores.
I understand that
#pragma omp parallel for
will basically divide the iterations like this, in chunks of size/4 length:
| T1 | T2 | T3 | T4 |
However, in my particular situation, a different behavior would be more advantageous: one where each chunk is size/size (that is, one iteration) long. So thread 1 would not get iterations 0..size/4, but instead iterations 0, size/4, 2*size/4, 3*size/4:
|T1|T2|T3|T4|T1|T2|T3|T4|T1|T2|T3|T4|T1|T2|T3|T4|
How can I have my code execute like this when the number of iterations is not known until runtime?
Comments (1)
What you are describing, assuming that your heuristic is size/total threads, is round-robin scheduling, i.e., static scheduling with chunk_size = 1. For that you simply need to add the schedule(static, 1) clause to the directive:
#pragma omp parallel for schedule(static, 1)
In this case, it makes no difference whether the number of iterations is known ahead of time or only at runtime.
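To make the round-robin assignment visible, here is a minimal sketch in C. The helper name `record_owners` and the `owner` array are hypothetical, introduced only for illustration; the sketch assumes a compiler with OpenMP enabled (for example `-fopenmp` with GCC or Clang). It records which thread executed each iteration of a loop whose trip count `n` is only known at runtime.

```c
#ifdef _OPENMP
#include <omp.h>
#endif

/* Hypothetical helper (not from the original post): stores in owner[i]
 * the id of the thread that ran iteration i, and returns the number of
 * threads in the team. n can be any value computed at runtime. */
int record_owners(int n, int *owner)
{
    int team_size = 1;  /* stays 1 if compiled without OpenMP */

    #pragma omp parallel
    {
#ifdef _OPENMP
        int tid = omp_get_thread_num();

        /* One thread records the team size; the implicit barrier at the
         * end of the single construct makes it visible to the others. */
        #pragma omp single
        team_size = omp_get_num_threads();
#else
        int tid = 0;    /* serial fallback: one thread does everything */
#endif

        /* Chunks of one iteration, handed out to threads round-robin
         * in order of thread number. */
        #pragma omp for schedule(static, 1)
        for (int i = 0; i < n; ++i)
            owner[i] = tid;
    }

    return team_size;
}
```

If the runtime provides a team of four threads, calling `record_owners(16, owner)` should leave `owner` holding 0, 1, 2, 3, 0, 1, 2, 3, and so on, which matches the |T1|T2|T3|T4|T1|... layout in the question (with T1 being thread 0). The actual team size depends on the runtime and settings such as `OMP_NUM_THREADS`, so the sketch returns it instead of assuming 4.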