MPI datatype for a 2D array

Posted on 2024-08-28 23:17:05

I need to pass an array of integer arrays (basically a 2D array) from the root to all the processors. I am using MPI in a C program. How do I declare an MPI datatype for a 2D array, and how do I send the message (should I use broadcast or scatter)?

4 Answers

罪#恶を代价 2024-09-04 23:17:05

You'll need to use Broadcast, because you want to send a copy of the same message to every process. Scatter breaks up a message and distributes the chunks between processes.

As for how to send the data: the HIndexed datatype is for you.

Suppose your 2d array is defined like this:

int N;            // number of arrays (first dimension)
int sizes[N];     // number of elements in each array (second dimension)
int* arrays[N];   // pointers to the start of each array

First you have to calculate the displacement of each array's starting address relative to the starting address of the datatype, which, for convenience, can be the starting address of the first array:

MPI_Aint base;
MPI_Address(arrays[0], &base);                 // MPI_Get_address is the MPI-2+ name
MPI_Aint* displacements = new MPI_Aint[N];     // displacements must be MPI_Aint, not int
for (int i = 0; i < N; ++i)
{
    MPI_Address(arrays[i], &displacements[i]);
    displacements[i] -= base;
}

Then the definition for your type would be:

MPI_Datatype newType;
MPI_Type_hindexed(N, sizes, displacements, MPI_INT, &newType);   // MPI_INT for C ints; MPI_Type_create_hindexed is the MPI-2+ name
MPI_Type_commit(&newType);

This definition will create a datatype that contains all your arrays packed one after the other. Once this is done, you just send your data as a single object of this type:

// The buffer must be the base address the displacements were measured from:
MPI_Bcast(arrays[0], 1, newType, root, comm);   // 'root' and 'comm' are whatever you need

However, you're not done yet. The receiving processes need to know the sizes of the arrays you're sending: if that knowledge isn't available at compile time, you'll have to send a separate message with that data first (a simple array of ints; a sketch follows below). If N, sizes and arrays are defined on the receiving processes similarly to the above, with enough space allocated to fill the arrays, then all the receiving processes need to do is define the same datatype (the exact same code as the sender) and receive the sender's message as a single instance of that type:

// Again, the buffer is the base address the displacements refer to:
MPI_Bcast(arrays[0], 1, newType, root, comm);    // 'root' and 'comm' must have the same values as in the sender's code

And voilà! All processes now have a copy of your array.
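
The size exchange mentioned above could look something like the following sketch. It assumes a variable rank holding the calling process's rank, and that on non-root processes sizes and arrays are heap-allocated pointers (int* and int** respectively) rather than the fixed-size declarations shown earlier:

// Sketch only: broadcast the shape first, so that non-root ranks can allocate
// their arrays before the datatype is built and the data is broadcast.
MPI_Bcast(&N, 1, MPI_INT, root, comm);            // every rank learns N
if (rank != root)
    sizes = new int[N];
MPI_Bcast(sizes, N, MPI_INT, root, comm);         // every rank learns each array's length
if (rank != root)
{
    arrays = new int*[N];
    for (int i = 0; i < N; ++i)
        arrays[i] = new int[sizes[i]];            // allocate space to receive into
}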

Of course, things get a lot easier if the second dimension of your 2D array is fixed to some value M. In that case, the simplest solution is to store everything in a single int[N*M] array (say, int data[N*M]): C and C++ guarantee that it is contiguous memory, so you can broadcast it without defining a custom datatype, like this:

MPI_Bcast(data, N*M, MPI_INT, root, comm);   // 'data' is the flat int[N*M] buffer

Note: you might get away with using the Indexed type instead of HIndexed. The difference is that in Indexed the displacements array is given as a number of elements, while in HIndexed it is given in bytes (the H stands for heterogeneous). If you were to use Indexed, the values in displacements would have to be divided by sizeof(int). However, I'm not sure whether integer arrays allocated at arbitrary positions on the heap are guaranteed to line up on int boundaries, and in any case the HIndexed version has (marginally) less code and produces the same result.
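
For completeness, the Indexed variant just described might look like the following sketch. It reuses sizes and the byte displacements computed earlier (elemDispls and indexedType are illustrative names), and only works if every displacement is an exact multiple of sizeof(int):

// Sketch of the Indexed alternative: displacements counted in elements, not bytes.
int* elemDispls = new int[N];
for (int i = 0; i < N; ++i)
    elemDispls[i] = (int)(displacements[i] / (MPI_Aint)sizeof(int));   // must divide exactly

MPI_Datatype indexedType;
MPI_Type_indexed(N, sizes, elemDispls, MPI_INT, &indexedType);
MPI_Type_commit(&indexedType);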

乙白 2024-09-04 23:17:05

If you are sending a contiguous block of data (I think C arrays are contiguous, but I'm a Fortran programmer and am not terribly sure), you don't need to declare a new MPI datatype, though there are some reasons why you might want to. Scattering is for distributing, say, an array across a number of processes; you might use scatter to send each row of an array to a different process (a sketch of that follows below). So for your example of a contiguous array of integers, your simplest option is to broadcast, like this (bearing in mind my poor C skills):

MPI_Bcast(&buf, numRows*numCols, MPI_INT, root, MPI_COMM_WORLD);

where

&buf is the address of the first element in the array

numRows*numCols is, of course, the number of elements in the 2D array

MPI_INT is (probably) the intrinsic datatype you will be using

root is the rank of the process which is broadcasting the array

MPI_COMM_WORLD is the usual default communicator, change if required

And don't forget that broadcasting is a collective operation, all processes make the same call.
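
The row-scatter alternative mentioned above might look like the following sketch. It assumes the data is stored on the root as one contiguous block named matrix (an illustrative name), with numRows equal to the number of processes in the communicator:

// Sketch only: scatter one row of a contiguous numRows x numCols matrix to each rank.
int* row = new int[numCols];                       // each rank receives exactly one row
MPI_Scatter(matrix, numCols, MPI_INT,              // send buffer (significant on the root only)
            row,    numCols, MPI_INT,              // receive buffer on every rank
            root, MPI_COMM_WORLD);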

If your array is not contiguous, post again with some sample array sizes, and we'll figure out how to define an MPI datatype.

罪歌 2024-09-04 23:17:05
MPI_Send(tempmat, 16, MPI_INT, 0, 0, MPI_COMM_WORLD);

MPI_Recv(resultmaster, 16, MPI_INT, MPI_ANY_SOURCE, 0, MPI_COMM_WORLD, &stat);

I am getting only the first row of the matrix when using the above APIs.

浮华 2024-09-04 23:17:05

1) Your array of arrays can't be passed directly to another process, because virtual addresses might differ; that is, the first-dimension array holding the pointers to the other arrays won't make sense on any other process. So you have to pass each array separately and manually reassemble your "2d array" on the receiver side (a sketch of this follows below).

2) Broadcast vs. Scatter. Broadcast sends the complete array to all other MPI ranks in the communicator. Scatter, on the other hand, distributes the source array over the MPI ranks, i.e. with broadcast each rank receives a copy of the source array, while with scatter each rank receives a different part of it.
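
A minimal sketch of the row-by-row approach described in 1), reusing the N, sizes and arrays names from the first answer and assuming every rank already knows N and sizes[] (for example from a preceding broadcast) and has its own rank in rank:

// Sketch only: broadcast each row separately and rebuild the "2d array" locally.
for (int i = 0; i < N; ++i)
{
    if (rank != root)
        arrays[i] = new int[sizes[i]];                    // allocate the row locally first
    MPI_Bcast(arrays[i], sizes[i], MPI_INT, root, MPI_COMM_WORLD);
}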
