Physical location of the processes assigned by the MPI run-time system
When we launch an MPI program on a cluster with a command such as mpirun -np 4 a.out, how does the MPI run-time system assign the processes across the CPUs?
What I mean is: suppose it finds an idle quad-core CPU in the cluster. Will it run all 4 processes on that CPU, or will it find 4 CPUs and run one process per CPU?
Does this depend on the particular implementation of MPI?
And should I be bothered by the particular configuration MPI picks for me (4 processes on one CPU, or 1 process per CPU on 4 CPUs)?
1 Answer
Yes, it depends on the MPI implementation, and yes, it matters. For instance, if you were expecting a node's worth of memory per MPI task and you find all 4 tasks loaded onto a single node with nothing on the others, you're going to run into serious problems. Similarly, if you are running 4 MPI tasks with 8 OpenMP threads each on four 8-core nodes, there's a big difference between 1 task with 8 threads on each of the 4 nodes and 4 tasks with 32 threads crammed onto one node while the other nodes sit idle.
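As a concrete sketch of the well-spread hybrid layout, assuming OpenMPI (using the --bynode option described in the next paragraph) and an application that honours the standard OMP_NUM_THREADS variable:

    export OMP_NUM_THREADS=8        # 8 OpenMP threads per MPI task
    mpirun -np 4 --bynode ./a.out   # one MPI task on each of the 4 nodes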
The most common MPI implementations on x86-type hardware are OpenMPI- and MPICH2-based. OpenMPI will fill up a node before going to the next one; you can change that behaviour by, for instance, giving it the "--bynode" option, which assigns one task to one node, the next task to the next, and so on, wrapping around to the first node again as needed. (OpenMPI also has --bysocket and --bycore for finer control, and the very useful --display-map option, which shows you exactly what goes where.)
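To compare the two placements concretely, here is a sketch using only the options just mentioned (a.out stands in for your own program):

    # Default OpenMPI placement: fill the first node before moving to the next
    mpirun -np 4 --display-map ./a.out

    # One task per node, wrapping around as needed
    mpirun -np 4 --bynode --display-map ./a.out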
With mpich2-based MPIs, you can give it the -rr option for "round robin", which will round-robin tasks between nodes (i.e., OpenMPI's --bynode behaviour).
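For example, assuming your mpich2-based installation provides an mpiexec that accepts the -rr flag described above, the round-robin launch would look something like:

    # Round-robin task placement across nodes (mpich2-based launcher)
    mpiexec -rr -np 4 ./a.out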
In either case, on Linux-type systems you can always run, e.g., mpirun -np 4 hostname as a quick-and-dirty way to find out which hosts your mpirun command would launch processes on.
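If you want the same information from inside the program, a minimal self-contained C example using the standard MPI_Get_processor_name call reports which host each rank lands on:

    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char **argv)
    {
        int rank, size, len;
        char name[MPI_MAX_PROCESSOR_NAME];

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);   /* this process's rank */
        MPI_Comm_size(MPI_COMM_WORLD, &size);   /* total number of tasks */
        MPI_Get_processor_name(name, &len);     /* hostname this rank runs on */

        printf("Rank %d of %d running on %s\n", rank, size, name);

        MPI_Finalize();
        return 0;
    }

Compile it with mpicc and launch it the same way as your application (e.g., mpirun -np 4 ./a.out); each rank prints its host, so you can see the placement directly.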