MPI all-to-all transfer of large data

Posted 2024-09-05 06:36:04

My MPI application has several processes that each generate a large amount of data. Say we have N+1 processes (one master for control, the rest workers). Each worker generates a large data set, which is currently just written to an ordinary file, named file1, file2, ..., fileN. The file sizes may differ considerably. Now I need to send every fileM to the process with rank M for the next stage of work, so this is effectively an all-to-all data transfer.

My question is: how should I use the MPI API to send these files efficiently? I used to transfer them through a Windows shared folder, but I don't think that's a good idea.

I have thought about MPI_File and MPI_Alltoall, but these functions don't seem well suited to my case. Plain MPI_Send and MPI_Recv seem hard to use because every process needs to transfer a large amount of data, and I don't want to use a distributed file system for now.

Comments (1)

翻了热茶 2024-09-12 06:36:04

It's not possible to answer your question precisely without a lot more data, data that only you have right now. So here are some generalities; you'll have to think about them and see whether and how they apply to your situation.

  • If your processes are generating large data sets, they are unlikely to be doing so instantaneously. Instead of waiting until the whole data set has been created, you might want to think about transferring it chunk by chunk.
  • I don't think that MPI_Send and MPI_Recv (or the variations on them) are hard to use for large amounts of data. But you do need to give some thought to finding the right amount to transfer in each communication between processes. With MPI it is not simply a case of a message startup time plus a message transfer rate that apply to all messages sent. Some IBM implementations, for example, had different latencies and bandwidths for small and large messages on some of their hardware. You will have to figure out for yourself what the tradeoffs between bandwidth and latency are on your platform. The only general advice I would give here is to parameterise the message sizes and experiment until you maximise the ratio of computation to communication.
  • As an aside, one of the tests you should already have done is to measure message transfer rates for a wide range of sizes and communication patterns on your platform. That's a basic shake-down test when you start work on a new system. If you don't have anything more suitable, the STREAMS benchmark will help you get started.
  • I think that all-to-all transfer of large amounts of data is an unusual scenario in the kinds of programs for which MPI is typically used. You may want to give some serious thought to redesigning your application to avoid such transfers. Of course, only you know whether that is feasible or worthwhile. From what little information you provide, it seems as if you might be implementing some kind of pipeline; in such cases the usual pattern of communication is from process 0 to process 1, process 1 to process 2, 2 to 3, and so on.
  • Finally, if you happen to be working on a computer with shared memory (such as a multicore PC), you might think about using a shared-memory approach, such as OpenMP, to avoid passing large amounts of data around.
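
To make the first two points a bit more concrete, here is a minimal sketch of chunk-by-chunk transfer with a tunable chunk size. It is only an illustration under several assumptions: it uses Python with mpi4py (the question does not say which language or MPI binding is in use), the helper names `iter_chunks`, `send_stream`, `recv_stream` and `demo`, the tag value, and the end-of-stream marker (`None`) are all made up for this example, and the ranks and payload in `demo` are placeholders. It relies on MPI delivering messages between one pair of ranks with the same tag in the order they were sent.

```python
import io

TAG_DATA = 7  # arbitrary tag chosen for this sketch


def iter_chunks(stream, chunk_bytes):
    """Yield successive blocks of at most chunk_bytes from a binary stream."""
    while True:
        block = stream.read(chunk_bytes)
        if not block:
            return
        yield block


def send_stream(comm, stream, dest, chunk_bytes):
    """Send a binary stream to rank dest one chunk at a time; return bytes sent."""
    sent = 0
    for block in iter_chunks(stream, chunk_bytes):
        comm.send(block, dest=dest, tag=TAG_DATA)
        sent += len(block)
    comm.send(None, dest=dest, tag=TAG_DATA)  # end-of-stream marker (assumed protocol)
    return sent


def recv_stream(comm, source, sink):
    """Write chunks from rank source into sink until the marker arrives."""
    received = 0
    while True:
        block = comm.recv(source=source, tag=TAG_DATA)
        if block is None:
            return received
        sink.write(block)
        received += len(block)


def demo():
    """Hypothetical driver: rank 1 streams 10 MiB to rank 2 (needs >= 3 ranks)."""
    from mpi4py import MPI  # imported here so the helpers load without MPI

    comm = MPI.COMM_WORLD
    if comm.Get_size() < 3:
        return
    rank = comm.Get_rank()
    chunk_bytes = 4 * 1024 * 1024  # the parameter to vary while measuring
    if rank == 1:
        payload = io.BytesIO(b"x" * (10 * 1024 * 1024))  # stand-in for generated data
        send_stream(comm, payload, dest=2, chunk_bytes=chunk_bytes)
    elif rank == 2:
        sink = io.BytesIO()
        print("rank 2 received", recv_stream(comm, source=1, sink=sink), "bytes")
```

To try it, call `demo()` at the end of the script and launch it with something like `mpiexec -n 3 python script.py` (the script name is a placeholder). The lowercase `send`/`recv` calls pickle each chunk, which keeps the sketch short; mpi4py's buffer-based `Send`/`Recv` avoid that overhead and are probably the better choice once you start measuring. In a real worker the `stream` would be whatever produces the data, so sending can overlap with generation, and `chunk_bytes` is the value to parameterise and experiment with.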