OpenMPI MPI_Barrier problems
I am having some synchronization issues using the OpenMPI implementation of MPI_Barrier:
int rank;
int nprocs;
int rc = MPI_Init(&argc, &argv);
if (rc != MPI_SUCCESS) {
    fprintf(stderr, "Unable to set up MPI");
    MPI_Abort(MPI_COMM_WORLD, rc);
}
MPI_Comm_size(MPI_COMM_WORLD, &nprocs);
MPI_Comm_rank(MPI_COMM_WORLD, &rank);
printf("P%d\n", rank);
fflush(stdout);
MPI_Barrier(MPI_COMM_WORLD);
printf("P%d again\n", rank);
MPI_Finalize();
For mpirun -n 2 ./a.out, the output should be:
P0
P1
...
But the output is sometimes:
P0
P0 again
P1
P1 again
What's going on?
The order in which your output lines appear on your terminal is not necessarily the order in which they were printed. You are using a shared resource (stdout) for that, so there will always be an ordering problem. (And fflush doesn't help here; stdout is line-buffered anyhow.)
You could try prefixing your output with a timestamp and saving it to separate files, one per MPI process.
Then, to inspect your log, you could merge the files and sort by timestamp.
Your problem should then disappear.
There is nothing wrong with MPI_Barrier().
As Jens mentioned, the reason you are not seeing the output you expected is that stdout is buffered on each process. There is no guarantee that prints from multiple processes will be displayed on the calling process in order. (If stdout from each process were transferred to the main process for printing in real time, that would lead to lots of unnecessary communication!)
If you want to convince yourself that the barrier works, you could try writing to a file instead. Having multiple processes write to a single file may lead to extra complications, so you could have each process write to its own file and then, after the barrier, swap the files they write to. For example:
Sample implementation:
After running the code, you should get the following results:
For all files, the "after Barrier" statements will always appear later.
Output ordering is not guaranteed in MPI programs.
This is not related to MPI_Barrier at all.
Also, I would not spend too much time worrying about output ordering in MPI programs.
The most elegant way to achieve this, if you really want to, is to let the processes send their messages to one rank, say rank 0, and let rank 0 print the output either in the order it received them or ordered by rank.
Again, don't spend too much time trying to order the output from MPI programs. It is not practical and is of little use.
Adding to the previous answers here, your MPI_BARRIER works fine.
Though, if you just want to see it working, you can pause execution for a moment (SLEEP(1)) to let the output catch up.