Number of TCP connections used by an MPI program (MPICH2+nemesis+tcp)
How many TCP connections will an MPI program use to send data if the MPI implementation is MPICH2? If you also know about PMI connections, please count them separately.
For example, say I have 4 processes and 2 additional communicators (COMM1 for the 1st and 2nd processes, COMM2 for the 3rd and 4th); data is sent between every possible pair of processes, in every possible communicator.
I use a recent MPICH2 + Hydra + the default PMI. The OS is Linux, the network is switched Ethernet, and every process runs on a separate PC.
So, here are the data paths (between pairs of processes):
1 <-> 2 (in MPI_COMM_WORLD and COMM1)
1 <-> 3 (only in MPI_COMM_WORLD)
1 <-> 4 (only in MPI_COMM_WORLD)
2 <-> 3 (only in MPI_COMM_WORLD)
2 <-> 4 (only in MPI_COMM_WORLD)
3 <-> 4 (in MPI_COMM_WORLD and COMM2)
I think there can be:
- Case 1: only 6 TCP connections are used; data sent in COMM1 and in MPI_COMM_WORLD is mixed into a single TCP connection per pair.
- Case 2: 8 TCP connections: 6 in MPI_COMM_WORLD (all-to-all = full mesh) + 1 for 1 <-> 2 in COMM1 + 1 for 3 <-> 4 in COMM2.
- Some other variant that I didn't think about.
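For reference, here is a minimal sketch of the setup described above (my own illustration, not part of the original question): 4 ranks, the two sub-communicators built with MPI_Comm_split, and one message exchanged over every path in the list. Ranks are 0-based, so the "1st..4th process" above are ranks 0..3 here.

```c
#include <mpi.h>

int main(int argc, char **argv)
{
    int rank, size, peer;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);      /* run with -n 4 */

    /* color 0 -> COMM1 (ranks 0,1), color 1 -> COMM2 (ranks 2,3) */
    MPI_Comm subcomm;
    MPI_Comm_split(MPI_COMM_WORLD, rank / 2, rank, &subcomm);

    /* every pair exchanges one int in MPI_COMM_WORLD */
    int sendbuf = rank, recvbuf;
    for (peer = 0; peer < size; peer++) {
        if (peer == rank) continue;
        MPI_Sendrecv(&sendbuf, 1, MPI_INT, peer, 0,
                     &recvbuf, 1, MPI_INT, peer, 0,
                     MPI_COMM_WORLD, MPI_STATUS_IGNORE);
    }

    /* the pairs (0,1) and (2,3) also talk inside their sub-communicator */
    int subrank;
    MPI_Comm_rank(subcomm, &subrank);
    MPI_Sendrecv(&sendbuf, 1, MPI_INT, 1 - subrank, 1,
                 &recvbuf, 1, MPI_INT, 1 - subrank, 1,
                 subcomm, MPI_STATUS_IGNORE);

    MPI_Comm_free(&subcomm);
    MPI_Finalize();
    return 0;
}
```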
2 Answers
Which communicators are being used doesn't affect the number of TCP connections that are established. With --with-device=ch3:nemesis:tcp (the default configuration), you will use one bidirectional TCP connection between each pair of processes that communicate directly via point-to-point MPI routines. In your example, this means 6 connections. If you use collectives, then additional connections may be established under the hood. Connections will be established lazily, only as needed, but once established they will stay established until MPI_Finalize (and sometimes also MPI_Comm_disconnect) is called.
Off the top of my head I don't know how many connections are used for PMI, although I'm fairly sure it should be one per MPI process connecting to the hydra_pmi_proxy process, plus some other number (probably logarithmic) of connections among the hydra_pmi_proxy and mpiexec processes.
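One way to see the lazy establishment described above is a small experiment (my own sketch, not from the answer): only ranks 0 and 1 exchange a message, then everyone sleeps so you can inspect sockets on each node, e.g. with `lsof -i TCP -a -p <pid of an MPI process>`. If connections are created on demand, ranks 2 and 3 should show no connection to each other during the sleep window.

```c
#include <mpi.h>
#include <unistd.h>

int main(int argc, char **argv)
{
    int rank, buf = 0;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /* only the pair (0,1) communicates */
    if (rank == 0)
        MPI_Send(&buf, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
    else if (rank == 1)
        MPI_Recv(&buf, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);

    sleep(60);   /* window in which to count established TCP connections */
    MPI_Finalize();
    return 0;
}
```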
I can't answer your question completely, but here's something to consider. In MVAPICH2, we developed a tree-based connection mechanism for PMI, so each node has at most log(n) TCP connections. Since opening a socket subjects you to the open-file-descriptor limit on most OSes, it's probable that the MPI library would use a logical topology over the ranks to limit the number of TCP connections.
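To make the log(n) idea concrete, here is a sketch of one possible binomial-tree layout over n ranks (an illustration only, not MVAPICH2's actual code): the parent of rank r is r with its highest set bit cleared, and the children of r are r | (1 << k) for every power of two greater than r, while the result stays below n.

```c
#include <stdio.h>

/* parent of rank r (r > 0): clear r's highest set bit */
static int parent(int r)
{
    int high = 1;
    while ((high << 1) <= r)
        high <<= 1;
    return r - high;
}

int main(void)
{
    const int n = 8;   /* hypothetical number of ranks */
    for (int r = 0; r < n; r++) {
        if (r == 0)
            printf("rank 0: root, children:");
        else
            printf("rank %d: parent %d, children:", r, parent(r));
        for (int k = 1; k < n; k <<= 1)
            if (k > r && (r | k) < n)
                printf(" %d", r | k);
        printf("\n");
    }
    return 0;
}
```

With n = 8 this gives rank 0 the children {1, 2, 4} and no rank more than 3 connections in total, versus 7 per rank for a full mesh, which is how a logarithmic topology keeps the file-descriptor count down.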