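MPI segmentation fault when using very large arrays
I am trying to write an MPI program in C++ that sums the values of a very large array. The code below works fine with array sizes up to 1 million, but when I try to run it with 10 million or more elements I get a segmentation fault. Can someone help me? Thanks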
#include <stdio.h>
#include "mpi.h"

int main(int argc, char *argv[]) {
    double t0, t1, time; // variables for timing
    int nprocs, myrank;
    int root=0;
    long temp, sumtot, i, resto, svStartPos, dim, intNum;

    // Size of the vector containing the values to sum
    const long A_MAX=10000000;

    MPI_Init(&argc, &argv);
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);
    MPI_Comm_rank(MPI_COMM_WORLD, &myrank);

    long vett[A_MAX];
    long parsum[B_MAX];
    long c=-1;
    int displs[nprocs];
    int sendcounts[nprocs];
    //printf("B_MAX: %ld\n", B_MAX);

    // We send (int)(A_MAX/nprocs) elements with a scatter; resto is the
    // number of remaining elements, which will be sent with the scatterv
    resto= A_MAX % nprocs;
    //printf("Resto: %d\n", resto);

    // Position from which the Scatterv starts
    svStartPos = A_MAX - resto;
    //printf("svStartPos: %d\n", svStartPos);

    // number of elements per process, not counting the remainder
    dim= (A_MAX-resto)/nprocs;
    //printf("dim: %d\n", dim);

    // Process 0 initializes the full vector whose sum we want to compute
    if (myrank==0){
        for (i=0; i<A_MAX; i++)
            vett[i]=1;
    }

    // Each process initializes the local vector whose elements will be
    // summed into a partial sum. That partial sum is then used in the
    // reduce operation
    for (i=0; i<B_MAX; i++)
        parsum[i]=-1;

    // Each process initializes the sendcounts and displs vectors needed by
    // the scatterv operation
    for (i=0; i<nprocs; i++){
        if (i<A_MAX-svStartPos){
            // If the process rank is between 0 and resto ...
            sendcounts[i]=1;         // ...1 element of vett will be sent...
            displs[i]= svStartPos+i; // ...the one at position svStartPos+i
        }
        else {
            // if the process rank is > resto ...
            sendcounts[i]=0;         // ...no element will be sent
            displs[i]= A_MAX;
        }
    }

    root = 0;   // The master process
    sumtot = 0; // Total sum of the elements of vett
    temp = 0;   // temporary value of the partial sums

    MPI_Barrier(MPI_COMM_WORLD);
    if (A_MAX>=nprocs){
        MPI_Scatter(&vett[dim*myrank], dim, MPI_LONG, &parsum, dim, MPI_LONG, 0, MPI_COMM_WORLD);
        printf("Processore: %d - Scatter\n", myrank);
    }

    // The scatterv is performed only by processes with rank
    // 0<myrank<resto
    if (sendcounts[myrank]==1){
        MPI_Scatterv(&vett,sendcounts,displs,MPI_LONG,&c,1,MPI_LONG,0,MPI_COMM_WORLD);
        parsum[B_MAX-1]=c;
        printf("Processore: %d - effettuo la Scatterv\n", myrank);
    }

    MPI_Barrier(MPI_COMM_WORLD);
    if(myrank==0){
        t0 = MPI_Wtime(); // start timing
    }

    for(i=0; i<B_MAX; i++){
        if (parsum[i]!=-1)
            temp = temp + parsum[i]; // sum of the elements
    }
    printf("Processore: %d - Somma parziale: %ld\n", myrank, temp);

    MPI_Barrier(MPI_COMM_WORLD);
    // each process's partial sum is sent to the root, which adds up the
    // partial results
    MPI_Reduce(&temp,&sumtot,1,MPI_LONG,MPI_SUM,root,MPI_COMM_WORLD);
    MPI_Barrier(MPI_COMM_WORLD);

    if(myrank==0){
        t1 = MPI_Wtime(); // stop timing
        // compute and print the elapsed time
        time = 1.e6 * (t1-t0);
        printf("NumProcessori: %d Somma: %ld Tempo: %f\n", nprocs, sumtot, time);
        // check the sum. If it is correct, sumtot equals 0.
        sumtot = sumtot - A_MAX;
        printf("Verifica: %ld\n", sumtot);
    }

    MPI_Finalize();
    return 0;
}
Comments (2)
The first real error I found is in the MPI_Scatterv call, which passes the address of a ::std::vector<int> to a function that expects a void* in that argument. The conversion of any pointer type (like ::std::vector<int>*) to void* is allowed as an implicit conversion, so there are no compile errors at this point. However, MPI_Scatterv expects its first argument to be the address of the send buffer, which MPI expects to be a normal array.
I guess that you recently changed your code from the commented-out sections, where vett is an array, and tried to get your call to work by adding the address-of operator in your MPI_Scatterv call. The original array probably caused segfaults at some point since it was stack-allocated and you ran out of stack space with those monsters (the default stack size on Linux systems is on the order of megabytes, IIRC, which would exactly fit that assumption; test this with ulimit -s).
The change to ::std::vector<int> caused the actual data to be placed on the heap instead, which has a much larger maximum size (on 64-bit systems you can expect to run out of physical memory well before reaching it).
You actually already implemented a solution to your particular problem a few lines earlier, in the MPI_Scatter call: there, you access an element and then take its address (note that [] binds tighter than &). This is OK as long as you do not modify the underlying vector. If you just apply that solution to the MPI_Scatterv call as well, you can solve this problem quite easily.
In any case, except for the two vector objects, your code looks like it was written for the old C standard, not C++. For example, you might consider looking into things like the new operator family instead of malloc.h, you can put your variable declarations in line with their definitions (even inside for loop headers!), and you can ease your life by using the ostream cout instead of printf ...
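The difference between the address of the vector object and the address of its elements can be sketched without MPI. In the sketch below, sum_buffer is a hypothetical stand-in for a call that, like MPI_Scatterv, takes an untyped buffer pointer; it is not part of MPI. The element type long is an assumption taken from the question's code, and sum_ones is likewise an illustrative name.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for an MPI-style call that receives an untyped
// buffer. It reads `count` long values starting at `buf` and sums them.
long sum_buffer(const void* buf, std::size_t count) {
    assert(buf != nullptr || count == 0);
    const long* p = static_cast<const long*>(buf);
    long total = 0;
    for (std::size_t i = 0; i < count; ++i) {
        total += p[i];
    }
    return total;
}

// Fills a heap-backed vector with ones and hands its element storage
// to sum_buffer.
long sum_ones(std::size_t n) {
    std::vector<long> vett(n, 1);
    // sum_buffer(&vett, ...) would also compile, because any object pointer
    // converts to void*, but it would point at the vector object itself
    // rather than at the elements.
    return sum_buffer(vett.data(), vett.size()); // same address as &vett[0] when n > 0
}
```

Calling sum_ones(1000) returns 1000. Under the same assumptions, the question's MPI calls would receive vett.data() or &vett[0] as the send buffer instead of &vett.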
The program looks like C to me, since you are not using any C++ facilities or any C++ header (stdio.h would have been cstdio, without the .h).
Anyway, can you please replace the array allocation, A[very huge number], with a dynamic allocation? Use malloc if you want C, or new otherwise. Then post the results.
This seems to be a problem that heap allocation would solve: a very large local array can overflow the stack (http://c-faq.com/strangeprob/biglocal.html).
Let me know.
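The suggestion to use new instead of a local array might look like the following minimal sketch. sum_heap_array is a hypothetical helper name, and std::unique_ptr is used here only so that the new[] allocation is released automatically. Assuming an 8-byte long, as on typical 64-bit Linux systems, 10,000,000 elements need about 80 MB, which is far more than a stack limit on the order of megabytes.

```cpp
#include <cstddef>
#include <memory>

// Allocates n longs on the heap with new[], fills them with ones as
// rank 0 does for vett in the question, and returns their sum.
long sum_heap_array(std::size_t n) {
    std::unique_ptr<long[]> vett(new long[n]); // heap storage, freed automatically
    for (std::size_t i = 0; i < n; ++i) {
        vett[i] = 1;
    }
    long total = 0;
    for (std::size_t i = 0; i < n; ++i) {
        total += vett[i];
    }
    return total;
}
```

With this kind of allocation, vett.get() gives the plain pointer that a buffer argument such as the one in MPI_Scatter expects.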