C、OpenMP:如何使三重循环的并行化更好?
我正在尝试使用 OpenMP 并行化 Floyd-Warshall 算法(基本上就地编辑 2D 数组),但我怀疑我是否会以最好的方式进行处理,这是我到目前为止所得到的:
#pragma omp parallel for private(i, j, k) shared(g)
for ( i = 0; i < n; i++ ) {
for ( j = 0; j < n; j++ ) {
for ( k = 0; k < n; k++ ) {
g->A[j][k] = imin( g->A[j][k], g->A[j][i] + g->A[i][k] );
}
}
}
有什么想法可以更好地利用 OpenMP?目前,运行时间只是减半,当然可以改进。
另外,如果有人对用于并行化的其他技术有任何建议,我会洗耳恭听。我考虑过 MPI,但我必须使整个 main
函数并行,对吗?
谢谢。
编辑
上面的代码不起作用,下面的答案说明了原因。
I'm trying to parallelise the Floyd-Warshall algorithm using OpenMP (basically editing a 2D array in-place), but I doubt that I'm going about it in the best way, here is what I've got so far:
#pragma omp parallel for private(i, j, k) shared(g)
for ( i = 0; i < n; i++ ) {
for ( j = 0; j < n; j++ ) {
for ( k = 0; k < n; k++ ) {
g->A[j][k] = imin( g->A[j][k], g->A[j][i] + g->A[i][k] );
}
}
}
Any Ideas how I can utilise OpenMP better? At the moment that just halves runtime, surely that can be improved.
Also if anyone as any suggestions for other technologies to be used for the parallelisation, I'm all ears. I thought about MPI, but I'd have to make my whole main
function parallel right?
Thanks.
EDIT
The code above does not work, the answers below show why.
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(1)
并行化算法并不简单。请参阅此处的注释
http://www.mcs.anl.gov/~itf/dbpp /text/node35.html
有关并行运行它的信息。如果您拥有少量处理器(双核、四核、八核机器),那么 Parallel Floyd 1 可能适合您。如果您有大量处理器(非常棒的 GPU、网格计算机),那么 Parallel Floyd 2 可能会更好。
Parallelizing the algorithm is not straightforward. See the notes here
http://www.mcs.anl.gov/~itf/dbpp/text/node35.html
for information on running it in parallel. If you have a small number of processors (dual-, quad-, oct- core machine) then Parallel Floyd 1 is probably right for you. If you have a huge number of processors (really awesome GPU, mesh computer) then Parallel Floyd 2 might be better.