When allocating buffer space to receive a custom datatype, should I refer to the datatype's true extent?

I want to send a square grid with side length N from one process to another process, which will use it to fill a rectangular grid with the same number of rows but with one extra column on the left (which I want to keep uninitialized for the moment), so having N rows and N+1 columns.
As shown in the code below, I leverage three different custom datatypes to space out the rows on the receiving side (they are also used in other parts of the code not shown here): one modeling a row made of N MPI_CHAR in the sender matrix (row_t), one obtained by resizing it to an extent of N+1 bytes with MPI_Type_create_resized (row_t_res), and one consisting of N copies of the latter replicated into contiguous locations with an MPI_Type_contiguous call (ext_part_t).

My question is: assuming I start filling the receiver matrix (with MPI_Recv) from the second element (index 1) of the first row (so as to keep the first element of the left column uninitialized, like the other elements of that column), how many elements do I have to allocate on the receiver side to avoid an overflow? N*(N+1) (i.e. based on the true extent of ext_part_t obtained with MPI_Type_get_true_extent, which is N*(N+1)-1) or N*(N+1)+1 (based on the extent obtained with MPI_Type_get_extent, which is N*(N+1))?
Even though it is never filled with data, the trailing (padding) cell of the last row_t_res block would cross the right boundary of the receiver's allocated array, and I don't know whether MPI internally accesses it.
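
To make the two numbers concrete, here is a minimal, self-contained sketch (runnable with a single process) that only builds ext_part_t as in the full program below and queries its extent and true extent; for N = 4 it should report an extent of 20 = N*(N+1) and a true extent of 19 = N*(N+1)-1:

#include <stdio.h>
#include <mpi.h>

int main(int argc, char *argv[])
{
    int N = 4;
    MPI_Datatype row_t, row_t_res, ext_part_t;
    MPI_Aint lb, extent, true_lb, true_extent;

    MPI_Init(&argc, &argv);

    /* Same construction as in the full program below. */
    MPI_Type_contiguous(N, MPI_CHAR, &row_t);
    MPI_Type_create_resized(row_t, 0, N + 1, &row_t_res);
    MPI_Type_contiguous(N, row_t_res, &ext_part_t);
    MPI_Type_commit(&ext_part_t);

    MPI_Type_get_extent(ext_part_t, &lb, &extent);
    MPI_Type_get_true_extent(ext_part_t, &true_lb, &true_extent);

    /* For N = 4: extent = 20, true extent = 19. */
    printf("extent = %ld, true extent = %ld\n", (long)extent, (long)true_extent);

    MPI_Type_free(&row_t);
    MPI_Type_free(&row_t_res);
    MPI_Type_free(&ext_part_t);
    MPI_Finalize();
    return 0;
}

With the receive starting at index 1, the two candidate allocation sizes discussed above are therefore 1 + 19 = 20 and 1 + 20 = 21 cells.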

#include <stdio.h>
#include <stdlib.h>
#include <mpi.h>

typedef signed char cell_type;

void init_domain(cell_type *grid, int N)
{
   int i;
   for (i = 0; i < N * N; i++)
   {
       grid[i] = i;
   }
}

int main(int argc, char *argv[])
{
   int N = 4;

   cell_type *matrix = NULL;

   int my_rank, comm_sz;

   MPI_Init(&argc, &argv);
   MPI_Comm_rank(MPI_COMM_WORLD, &my_rank);
   MPI_Comm_size(MPI_COMM_WORLD, &comm_sz);

   MPI_Datatype row_t, row_t_res, ext_part_t;

   if (0 == my_rank && 2 != comm_sz)
   {
       fprintf(stderr, "Program must be run with exactly 2 processes.\n");
       MPI_Abort(MPI_COMM_WORLD, EXIT_FAILURE);
   }

   if (0 == my_rank)
   {
       const size_t sender_matrix_size = (N * N) * sizeof(cell_type);
       matrix = (cell_type *)malloc(sender_matrix_size);

       init_domain(matrix, N);
   }

   MPI_Type_contiguous(N, MPI_CHAR, &row_t);
   MPI_Type_commit(&row_t);

   MPI_Type_create_resized(row_t,       //oldtype
                           0,           //lower bound
                           N + 1,       //extent
                           &row_t_res); //newtype
   MPI_Type_commit(&row_t_res);
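   /* Note: the lower bound and extent arguments of MPI_Type_create_resized
      are byte counts (MPI_Aint); since each cell here is one byte, an extent
      of N + 1 bytes corresponds to N data cells plus one padding cell. */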

   MPI_Type_contiguous(N,            //count
                       row_t_res,    //oldtype
                       &ext_part_t); //newtype
   MPI_Type_commit(&ext_part_t);
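   /* For reference: ext_part_t has extent N*(N+1) bytes (N copies of
      row_t_res, each with extent N+1), while its true extent is
      N*(N+1)-1 bytes, since the trailing padding byte of the last copy
      is not part of the typemap. */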

   if (0 == my_rank)
   {
       MPI_Send(matrix,          //const void *buf,
                N,               //int count,
                row_t,           //datatype
                1,               //dest
                0,               //tag
                MPI_COMM_WORLD); //comm
   }
   else
   {
       /*
                       +---+---+---+---+
       row_t           | X | X | X | X |
                       +---+---+---+---+
                       ^------ N ------^

                       +---+---+---+---+---+
       row_t_res       | X | X | X | X |   |
                       +---+---+---+---+---+
                       ^------ N ------^   
                       ^------  N+1  ------^

       ext_part_t:
                           +---+---+---+---+
                           | X | X | X | X |
                       +---+---+---+---+---+
                       |   | X | X | X | X |
                       +---+---+---+---+---+
                       |   | X | X | X | X |
                       +---+---+---+---+---+
                       |   | X | X | X | X |
                       +---+---+---+---+---+
                       |   | 
                       +---+

what I have to allocate at receiving side:
                       FIRST_RECV_CELL_IDX
                             |
                       +---+-v-+---+---+---+
                       |   | X | X | X | X |
                       +---+---+---+---+---+
                       |   | X | X | X | X |
                       +---+---+---+---+---+
                       |   | X | X | X | X |
                       +---+---+---+---+---+
                       |   | X | X | X | X |
                       +---+---+---+---+---+

                       |   | ?
                       +---+
*/

       MPI_Aint l, extent, true_extent;
       MPI_Type_get_extent(ext_part_t, &l, &extent);
       MPI_Type_get_true_extent(ext_part_t, &l, &true_extent);

       int receiver_matrix_rows = N;
       int receiver_matrix_cols = N + 1;

       const int FIRST_RECV_CELL_IDX = 1;

       const size_t receiver_matrix_size =
           //(FIRST_RECV_CELL_IDX + true_extent)
           receiver_matrix_rows * receiver_matrix_cols;

       //OR:
       //(FIRST_RECV_CELL_IDX + extent)
       //= receiver_matrix_rows * receiver_matrix_cols + FIRST_RECV_CELL_IDX; ?

       matrix = (cell_type *)malloc(receiver_matrix_size);

       MPI_Recv(&matrix[FIRST_RECV_CELL_IDX], //buf
                1,                            //count
                ext_part_t,                   //datatype
                0,                            //source
                0,                            //tag
                MPI_COMM_WORLD,               //comm
                MPI_STATUS_IGNORE             //status
       );
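       /* The single ext_part_t received above placed the N row_t blocks at
          offsets 1, 1+(N+1), 1+2*(N+1), ..., 1+(N-1)*(N+1) from 'matrix',
          leaving the left column (offsets 0, N+1, 2*(N+1), ...) untouched. */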

       /* Using the matrix extended with a left (uninitialized for now) 
       column... */
   }

   free(matrix);
   MPI_Type_free(&row_t);
   MPI_Type_free(&row_t_res);
   MPI_Type_free(&ext_part_t);

   MPI_Finalize();

   return EXIT_SUCCESS;
}

Running this code with different values of N has never shown me such an error, but maybe that is just luck (I'm using an OpenMPI 4.0.3 installation; the code can be compiled with mpicc -std=c99 -Wall -Wpedantic -O2 program.c -o program). Thanks in advance.
