Number of TCP connections used by an MPI program (MPICH2+nemesis+tcp)
How many TCP connections will an MPI program use to send data if the MPI implementation is MPICH2? If you also know about PMI connections, please count them separately.
For example, say I have 4 processes and 2 additional communicators (COMM1 for the 1st and 2nd processes, COMM2 for the 3rd and 4th); data is sent between every possible pair of processes, in every possible communicator.
I use a recent MPICH2 + Hydra + the default PMI. The OS is Linux, the network is switched Ethernet, and every process runs on a separate PC.
So, here are the data paths (between pairs of processes):
1 <-> 2 (in MPI_COMM_WORLD and COMM1)
1 <-> 3 (only in MPI_COMM_WORLD)
1 <-> 4 (only in MPI_COMM_WORLD)
2 <-> 3 (only in MPI_COMM_WORLD)
2 <-> 4 (only in MPI_COMM_WORLD)
3 <-> 4 (in MPI_COMM_WORLD and COMM2)
I think there can be:
- Case 1: only 6 TCP connections are used; data sent in COMM1 and in MPI_COMM_WORLD is mixed into a single TCP connection per pair.
- Case 2: 8 TCP connections: 6 in MPI_COMM_WORLD (all-to-all = full mesh) + 1 for 1 <-> 2 in COMM1 + 1 for 3 <-> 4 in COMM2.
- Some other variant that I didn't think about.
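For reference, here is a minimal sketch of the setup described above (my own illustration, not part of the original question): 4 ranks, the two sub-communicators built with MPI_Comm_split, and one message exchanged over every path in the list. Ranks are 0-based, so the "1st..4th process" above are ranks 0..3 here.

```c
#include <mpi.h>

int main(int argc, char **argv)
{
    int rank, size, peer;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);      /* run with -n 4 */

    /* color 0 -> COMM1 (ranks 0,1), color 1 -> COMM2 (ranks 2,3) */
    MPI_Comm subcomm;
    MPI_Comm_split(MPI_COMM_WORLD, rank / 2, rank, &subcomm);

    /* every pair exchanges one int in MPI_COMM_WORLD */
    int sendbuf = rank, recvbuf;
    for (peer = 0; peer < size; peer++) {
        if (peer == rank) continue;
        MPI_Sendrecv(&sendbuf, 1, MPI_INT, peer, 0,
                     &recvbuf, 1, MPI_INT, peer, 0,
                     MPI_COMM_WORLD, MPI_STATUS_IGNORE);
    }

    /* the pairs (0,1) and (2,3) also talk inside their sub-communicator */
    int subrank;
    MPI_Comm_rank(subcomm, &subrank);
    MPI_Sendrecv(&sendbuf, 1, MPI_INT, 1 - subrank, 1,
                 &recvbuf, 1, MPI_INT, 1 - subrank, 1,
                 subcomm, MPI_STATUS_IGNORE);

    MPI_Comm_free(&subcomm);
    MPI_Finalize();
    return 0;
}
```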
2 Answers
Which communicators are being used doesn't affect the number of TCP connections that are established. With --with-device=ch3:nemesis:tcp (the default configuration), you will use one bidirectional TCP connection between each pair of processes that communicate directly via point-to-point MPI routines. In your example, this means 6 connections. If you use collectives, then additional connections may be established under the hood. Connections will be established lazily, only as needed, but once established they will stay established until MPI_Finalize (and sometimes also MPI_Comm_disconnect) is called.
Off the top of my head I don't know how many connections are used for PMI, although I'm fairly sure it should be one per MPI process connecting to the hydra_pmi_proxy process, plus some other number (probably logarithmic) of connections among the hydra_pmi_proxy and mpiexec processes.
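One way to see the lazy establishment described above is a small experiment (my own sketch, not from the answer): only ranks 0 and 1 exchange a message, then everyone sleeps so you can inspect sockets on each node, e.g. with `lsof -i TCP -a -p <pid of an MPI process>`. If connections are created on demand, ranks 2 and 3 should show no connection to each other during the sleep window.

```c
#include <mpi.h>
#include <unistd.h>

int main(int argc, char **argv)
{
    int rank, buf = 0;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /* only the pair (0,1) communicates */
    if (rank == 0)
        MPI_Send(&buf, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
    else if (rank == 1)
        MPI_Recv(&buf, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);

    sleep(60);   /* window in which to count established TCP connections */
    MPI_Finalize();
    return 0;
}
```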
I can't answer your question completely, but here's something to consider. In MVAPICH2, we developed a tree-based connection mechanism for PMI, so each node has at most log(n) TCP connections. Since opening a socket subjects you to the open-file-descriptor limit on most OSes, it's probable that the MPI library would use a logical topology over the ranks to limit the number of TCP connections.
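To make the log(n) idea concrete, here is a sketch of one possible binomial-tree layout over n ranks (an illustration only, not MVAPICH2's actual code): the parent of rank r is r with its highest set bit cleared, and the children of r are r | (1 << k) for every power of two greater than r, while the result stays below n.

```c
#include <stdio.h>

/* parent of rank r (r > 0): clear r's highest set bit */
static int parent(int r)
{
    int high = 1;
    while ((high << 1) <= r)
        high <<= 1;
    return r - high;
}

int main(void)
{
    const int n = 8;   /* hypothetical number of ranks */
    for (int r = 0; r < n; r++) {
        if (r == 0)
            printf("rank 0: root, children:");
        else
            printf("rank %d: parent %d, children:", r, parent(r));
        for (int k = 1; k < n; k <<= 1)
            if (k > r && (r | k) < n)
                printf(" %d", r | k);
        printf("\n");
    }
    return 0;
}
```

With n = 8 this gives rank 0 the children {1, 2, 4} and no rank more than 3 connections in total, versus 7 per rank for a full mesh, which is how a logarithmic topology keeps the file-descriptor count down.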