MPICH vs OpenMPI
Can someone elaborate on the differences between the OpenMPI and MPICH implementations of MPI?
Which of the two is the better implementation?
Purpose
First, it is important to recognize how MPICH and Open-MPI are different, i.e. that they are designed to meet different needs. MPICH is supposed to be a high-quality reference implementation of the latest MPI standard and the basis for derivative implementations that meet special-purpose needs. Open-MPI targets the common case, both in terms of usage and network conduits.
Support for Network Technology
Open-MPI documents their network support here. MPICH lists this information in the README distributed with each version (e.g. this is for 3.2.1). Note that because both Open-MPI and MPICH support the OFI (aka libfabric) networking layer, they support many of the same networks. However, libfabric is a multi-faceted API, so not every network may be supported equally in both (e.g. MPICH has an OFI-based IBM Blue Gene/Q implementation, but I'm not aware of equivalent support in Open-MPI). That said, the OFI-based implementations of both MPICH and Open-MPI work on shared memory, Ethernet (via TCP/IP), Mellanox InfiniBand, Intel Omni Path, and likely other networks. Open-MPI also supports these networks and others natively (i.e. without OFI in the middle).
In the past, a common complaint about MPICH was that it did not support InfiniBand, whereas Open-MPI did. However, MVAPICH and Intel MPI (among others), both of which are MPICH derivatives, support InfiniBand, so if one is willing to define MPICH as "MPICH and its derivatives", then MPICH has extremely broad network support, including both InfiniBand and proprietary interconnects like Cray Seastar, Gemini and Aries as well as IBM Blue Gene (/L, /P and /Q). Open-MPI also supports the Cray Gemini interconnect, but its usage is not supported by Cray. More recently, MPICH supported InfiniBand through a netmod (now deprecated), but MVAPICH2 has extensive optimizations that make it the preferred implementation in nearly all cases.
Feature Support from the Latest MPI Standard
An orthogonal axis to hardware/platform support is coverage of the MPI standard. Here MPICH is usually far and away superior. MPICH has been the first implementation of every single release of the MPI standard, from MPI-1 to MPI-3. Open-MPI has only recently supported MPI-3 and I find that some MPI-3 features are buggy on some platforms (MPICH is not bug-free, of course, but bugs in MPI-3 features have been far less common).
Historically, Open-MPI has not had holistic support for MPI_THREAD_MULTIPLE, which is critical for some applications. It might be supported on some platforms but cannot generally be assumed to work. On the other hand, MPICH has had holistic support for MPI_THREAD_MULTIPLE for many years, although the implementation is not always high-performance (see "Locking Aspects in Multithreaded MPI Implementations" for one analysis).
Another feature that was broken in Open-MPI 1.x was one-sided communication, aka RMA. This has more recently been fixed and I find, as a very heavy user of these features, that they are generally working well in Open-MPI 3.x (see e.g. the ARMCI-MPI test matrix in Travis CI for results showing RMA working with both implementations, at least in shared-memory). I've seen similar positive results on Intel Omni Path, but have not tested Mellanox InfiniBand.
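Since MPI_THREAD_MULTIPLE "cannot generally be assumed to work", it is worth checking at startup which thread level the library actually granted. Below is a minimal sketch of that check, assuming the mpi4py bindings are installed on top of whichever MPI you are testing; the helper names `supports` and `query_provided_level` are my own and not part of any MPI API.

```python
# Thread-support levels in the order the MPI standard defines them
# (each level includes the guarantees of the ones before it).
THREAD_LEVELS = (
    "MPI_THREAD_SINGLE",
    "MPI_THREAD_FUNNELED",
    "MPI_THREAD_SERIALIZED",
    "MPI_THREAD_MULTIPLE",
)


def supports(provided: str, required: str) -> bool:
    """Return True if the granted level is at least the required level."""
    return THREAD_LEVELS.index(provided) >= THREAD_LEVELS.index(required)


def query_provided_level() -> str:
    """Ask the MPI library which thread level it granted (requires mpi4py)."""
    import mpi4py
    mpi4py.rc.thread_level = "multiple"  # request MPI_THREAD_MULTIPLE at init
    from mpi4py import MPI               # importing MPI initializes the library

    by_value = {
        MPI.THREAD_SINGLE: THREAD_LEVELS[0],
        MPI.THREAD_FUNNELED: THREAD_LEVELS[1],
        MPI.THREAD_SERIALIZED: THREAD_LEVELS[2],
        MPI.THREAD_MULTIPLE: THREAD_LEVELS[3],
    }
    return by_value[MPI.Query_thread()]


if __name__ == "__main__":
    try:
        level = query_provided_level()
    except ImportError:
        print("mpi4py is not installed; cannot query the MPI library")
    else:
        usable = supports(level, "MPI_THREAD_MULTIPLE")
        print(f"granted {level}; MPI_THREAD_MULTIPLE usable: {usable}")
```

A granted level of MPI_THREAD_MULTIPLE only tells you the library claims support; it says nothing about how well the implementation performs under threaded load.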
Process Management
One area where Open-MPI used to be significantly superior was the process manager. The old MPICH launcher (MPD) was brittle and hard to use. Fortunately, it has been deprecated for many years (see the MPICH FAQ entry for details). Thus, criticism of MPICH because of MPD is spurious.
The Hydra process manager is quite good and has similar usability and a similar feature set to ORTE (in Open-MPI), e.g. both support HWLOC for control over process topology. There are reports of Open-MPI process launching being faster than MPICH derivatives for larger jobs (1000+ processes), but since I don't have firsthand experience here, I am not comfortable stating any conclusions. Such performance issues are usually network-specific and sometimes even machine-specific.
I have found Open-MPI to be more robust when using MacOS with a VPN, i.e. MPICH may hang in startup due to hostname resolution issues. As this is a bug, this issue may disappear in the future.
Binary Portability
While both MPICH and Open-MPI are open-source software that can be compiled on a wide range of platforms, the portability of MPI libraries in binary form, or programs linked against them, is often important.
MPICH and many of its derivatives support ABI compatibility (website), which means that the binary interface to the library is constant and therefore one can compile with mpi.h from one implementation and then run with another. This is true even across multiple versions of the libraries. For example, I frequently compile against Intel MPI but LD_PRELOAD a development version of MPICH at runtime. One of the big advantages of ABI compatibility is that ISVs (Independent Software Vendors) can release binaries compiled against only one member of the MPICH family.
ABI is not the only type of binary compatibility. The scenarios described above assume that users employ the same version of the MPI launcher (usually mpirun or mpiexec, along with its compute-node daemons) and MPI library everywhere. This is not necessarily the case for containers.
While Open-MPI does not promise ABI compatibility, they have invested heavily in supporting containers (docs, slides). This requires great care in maintaining compatibility across different versions of the MPI launcher, launcher daemons, and MPI library, because a user may launch jobs using a newer version of the MPI launcher than the launcher daemons in the container support. Without careful attention to launcher interface stability, container jobs will not launch unless the versions of each component of the launcher are compatible. This is not an insurmountable problem:
I acknowledge Ralph Castain of the Open-MPI team for explaining the container issues to me. The immediately preceding quote is his.
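To make the LD_PRELOAD trick from this section concrete, here is a rough sketch of running an already-compiled binary with a different MPICH-ABI library preloaded. The library path, program name and the helper names `with_preload` and `run_with_other_mpi` are placeholders of mine, and it assumes the launcher forwards the environment to the ranks, which you should verify for your setup.

```python
import os
import subprocess


def with_preload(lib_path, env=None):
    """Return a copy of `env` with `lib_path` placed first in LD_PRELOAD."""
    new_env = dict(os.environ if env is None else env)
    existing = new_env.get("LD_PRELOAD", "")
    # The dynamic linker accepts a colon-separated list of libraries.
    new_env["LD_PRELOAD"] = f"{lib_path}:{existing}" if existing else lib_path
    return new_env


def run_with_other_mpi(lib_path, launcher_cmd):
    """Run an MPI program with another ABI-compatible library preloaded."""
    return subprocess.run(launcher_cmd, env=with_preload(lib_path), check=True)


# Hypothetical usage (both the path and the program are placeholders):
# run_with_other_mpi("/opt/mpich-dev/lib/libmpi.so",
#                    ["mpiexec", "-n", "4", "./app"])
```

This only makes sense between members of the MPICH ABI family; preloading an Open-MPI library under a binary built against MPICH is not expected to work.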
Platform-Specific Comparison
Here is my evaluation on a platform-by-platform basis:
Mac OS: both Open-MPI and MPICH should work just fine. To get the latest features of the MPI-3 standard, you need to use a recent version of Open-MPI, which is available from Homebrew. There is no reason to think about MPI performance if you're running on a Mac laptop.
Linux with shared-memory: both Open-MPI and MPICH should work just fine. If you want a release version that supports all of MPI-3 or MPI_THREAD_MULTIPLE, you probably need MPICH though, unless you build Open-MPI yourself, because e.g. Ubuntu 16.04 only provides the ancient version 1.10 via APT. I am not aware of any significant performance differences between the two implementations. Both support single-copy optimizations if the OS allows them.
Linux with Mellanox InfiniBand: use Open-MPI or MVAPICH2. If you want a release version that supports all of MPI-3 or MPI_THREAD_MULTIPLE, you likely need MVAPICH2 though. I find that MVAPICH2 performs very well but haven't done a direct comparison with OpenMPI on InfiniBand, in part because the features for which performance matters most to me (RMA aka one-sided) have been broken in Open-MPI in the past.
Linux with Intel Omni Path (or its predecessor, True Scale): I have used MVAPICH2, Intel MPI, MPICH and Open-MPI on such systems, and all are working. Intel MPI tends to be the most optimized, while Open-MPI delivered the best performance of the open-source implementations because it has a well-optimized PSM2-based back-end. I have some notes on GitHub on how to build different open-source implementations, but such information goes stale rather quickly.
Cray or IBM supercomputers: MPI comes installed on these machines automatically and it is based upon MPICH in both cases. There have been demonstrations of MPICH on Cray XC40 (here) using OFI, Intel MPI on Cray XC40 (here) using OFI, MPICH on Blue Gene/Q using OFI (here), and Open-MPI on Cray XC40 using both OFI and uGNI (here), but none of these are vendor supported.
Windows: I see no point in running MPI on Windows except through a Linux VM, but both Microsoft MPI and Intel MPI support Windows and are MPICH-based. I have heard reports of successful builds of MPICH or Open-MPI using Windows Subsystem for Linux but have no personal experience.
Notes
In full disclosure, I currently work for Intel in a research/pathfinding capacity (i.e. I do not work on any Intel software products) and formerly worked for Argonne National Lab for five years, where I collaborated extensively with the MPICH team.
If you are doing development rather than running a production system, go with MPICH. MPICH has a built-in debugger, while Open-MPI did not the last time I checked.
In production, Open-MPI will most likely be faster. But then you may want to research other alternatives, such as Intel MPI.
I concur with the previous poster. Try both to see which one your application runs faster on, then use that one for production. They are both standards-compliant. If it is your desktop, either is fine. OpenMPI comes out of the box on MacBooks, and MPICH seems to be more Linux/Valgrind friendly. It is between you and your toolchain.
If it is a production cluster, you need to do more extensive benchmarking to make sure it is optimized for your network topology. Configuring it on a production cluster will be the main difference in terms of your time, as you will have to RTFM.
Both are standards-compliant, so it shouldn't matter which you use from a correctness point of view. Unless there is some feature that you need, such as specific debug extensions, benchmark both and pick whichever is faster for your apps on your hardware. Also consider that there are other MPI implementations that might give better performance or compatibility, such as MVAPICH (can have the best InfiniBand performance) or Intel MPI (widely supported by ISVs). HP worked hard to get their MPI qualified with lots of ISV codes too, but I'm not sure how it is faring after being sold on to Platform...
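As a rough illustration of "benchmark both and pick whichever is faster", something like the following could time the same application under each launcher and compare medians. The launcher paths and binary names in the usage comment are hypothetical, and wall-clock time of the whole command includes launch overhead, so treat it as a first-pass sketch rather than a proper benchmark.

```python
import statistics
import subprocess
import time


def time_command(cmd, repeats=3):
    """Run `cmd` several times and return the wall-clock durations in seconds."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        subprocess.run(cmd, check=True)
        durations.append(time.perf_counter() - start)
    return durations


def fastest(timings):
    """Return the key in `timings` with the lowest median duration."""
    return min(timings, key=lambda name: statistics.median(timings[name]))


# Hypothetical usage: each build of the app is linked against its own MPI.
# timings = {
#     "mpich": time_command(["/opt/mpich/bin/mpiexec", "-n", "4", "./app.mpich"]),
#     "openmpi": time_command(["/opt/openmpi/bin/mpirun", "-np", "4", "./app.ompi"]),
# }
# print("faster on this machine:", fastest(timings))
```

The median is used instead of the mean so that a single slow run (cold cache, noisy neighbour) does not decide the result.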
From my experience, one good feature that OpenMPI supports but MPICH does not is process affinity. For example, in OpenMPI, using -npersocket you can set the number of ranks launched on each socket. Also, OpenMPI's rankfile is quite handy when you want to pinpoint ranks to cores or oversubscribe them.
Last, if you need to control the mapping of ranks to cores, I would definitely suggest writing and compiling your code using OpenMPI.
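For anyone who has not seen a rankfile, here is a small sketch that generates one, assuming the classic Open MPI `rank N=host slot=socket:core` syntax. The host names and the function name `make_rankfile` are made up for illustration, and the exact syntax and launcher options can differ between Open MPI versions, so check the mpirun man page for yours.

```python
def make_rankfile(hosts, sockets_per_host, ranks_per_socket):
    """Build rankfile text that fills each socket of each host in turn."""
    lines = []
    rank = 0
    for host in hosts:
        for socket in range(sockets_per_host):
            for core in range(ranks_per_socket):
                # One rank pinned to logical core `core` of socket `socket`.
                lines.append(f"rank {rank}={host} slot={socket}:{core}")
                rank += 1
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical two-node cluster, two sockets each, two ranks per socket.
    print(make_rankfile(["node01", "node02"], 2, 2), end="")
```

You would save the output to a file and pass it to the launcher, e.g. `mpirun -np 8 --rankfile myrankfile ./app`, where `myrankfile` and `./app` are placeholders.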