Why does the compiler ignore OpenMP pragmas?

Posted 2024-10-14 00:12:49 | Length: 2740 | Views: 3 | Comments: 0

In the following C code I am using OpenMP in a nested loop. Because a race condition occurs on the shared accumulator, I want the final update to be atomic:

double mysumallatomic() {

  double S2 = 0.;
  #pragma omp parallel for shared(S2)
  for(int a=0; a<128; a++){
    for(int b=0; b<128;b++){
      double myterm = (double)a*b;
      #pragma omp atomic
      S2 += myterm;
    }
  }
  return S2;
}

The problem is that #pragma omp atomic has no effect on the program's behaviour: if I remove it, nothing changes. Even if I change it to #pragma oh_my_god, I get no error!

I wonder what is going wrong here, whether I can tell the compiler to be stricter when checking omp pragmas, and why I do not get an error when I make that last change.

PS: For compilation I use:

gcc-4.2 -fopenmp main.c functions.c -o main_elec_gcc.exe

PS2: New code that gives me the same problem, based on gillespie's idea:

#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <omp.h>
#include <math.h>

#define NRACK 64
#define NSTARS 1024

double mysumallatomic_serial(float rocks[NRACK][3], float moon[NSTARS][3],
                             float qr[NRACK],float ql[NSTARS]) {
  int j,i;
  float temp_div=0.,temp_sqrt=0.;
  float difx,dify,difz;
  float mod2x, mod2y, mod2z;
  double S2 = 0.;

  for(j=0; j<NRACK; j++){
    for(i=0; i<NSTARS;i++){     
      difx=rocks[j][0]-moon[i][0];
      dify=rocks[j][1]-moon[i][1];
      difz=rocks[j][2]-moon[i][2];
      mod2x=difx*difx;
      mod2y=dify*dify;
      mod2z=difz*difz;
      temp_sqrt=sqrt(mod2x+mod2y+mod2z);
      temp_div=1/temp_sqrt;
      S2 += ql[i]*temp_div*qr[j];       
    }
  }
  return S2;
}

double mysumallatomic(float rocks[NRACK][3], float moon[NSTARS][3], 
                      float qr[NRACK],float ql[NSTARS]) {
  float temp_div=0.,temp_sqrt=0.;
  float difx,dify,difz;
  float mod2x, mod2y, mod2z;
  double S2 = 0.;

  #pragma omp parallel for shared(S2)
  for(int j=0; j<NRACK; j++){
    for(int i=0; i<NSTARS;i++){
      difx=rocks[j][0]-moon[i][0];
      dify=rocks[j][1]-moon[i][1];
      difz=rocks[j][2]-moon[i][2];
      mod2x=difx*difx;
      mod2y=dify*dify;
      mod2z=difz*difz;
      temp_sqrt=sqrt(mod2x+mod2y+mod2z);
      temp_div=1/temp_sqrt;
      float myterm=ql[i]*temp_div*qr[j];    
      #pragma omp atomic
      S2 += myterm;
    }
  }
  return S2;
}
int main(int argc, char *argv[]) {
  float rocks[NRACK][3], moon[NSTARS][3];
  float qr[NRACK], ql[NSTARS];
  int i,j;

  for(j=0;j<NRACK;j++){
    rocks[j][0]=j;
    rocks[j][1]=j+1;
    rocks[j][2]=j+2;
    qr[j] = j*1e-4+1e-3;
    //qr[j] = 1;
  }

  for(i=0;i<NSTARS;i++){
    moon[i][0]=12000+i;
    moon[i][1]=12000+i+1;
    moon[i][2]=12000+i+2;
    ql[i] = i*1e-3 +1e-2 ;
    //ql[i] = 1 ;
  }
  printf(" serial: %f\n", mysumallatomic_serial(rocks,moon,qr,ql));
  printf(" openmp: %f\n", mysumallatomic(rocks,moon,qr,ql));
  return(0);
}

Comments (4)

美人骨 2024-10-21 00:12:50
  1. Compiling with the -Wall flag reports pragma errors. For example, when I misspell atomic I get the following warning:

    main.c:15: warning: ignoring #pragma omp atomic1

  2. I'm sure you know, but just in case: your example should be handled with a reduction.

  3. When you use omp parallel, variables declared outside the parallel region are shared by default. That is not what you want here: each thread needs its own copy of temporaries such as difx, otherwise the threads overwrite each other's values. Instead, your loop should be:

    #pragma omp parallel for default(none),\
    private(difx, dify, difz, mod2x, mod2y, mod2z, temp_sqrt, temp_div, i, j),\
    shared(rocks, moon, ql, qr), reduction(+:S2)
    for(j=0; j<NRACK; j++){
      for(i=0; i<NSTARS;i++){
        difx=rocks[j][0]-moon[i][0];
        dify=rocks[j][1]-moon[i][1];
        difz=rocks[j][2]-moon[i][2];
        mod2x=difx*difx;
        mod2y=dify*dify;
        mod2z=difz*difz;
        temp_sqrt=sqrt(mod2x+mod2y+mod2z);
        temp_div=1/temp_sqrt;
        S2 += ql[i]*temp_div*qr[j];  
      }
    }
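
An alternative to listing every temporary in a private clause is to declare the temporaries inside the loop body, because variables declared inside the parallel region are private to each thread. The sketch below shows that pattern together with the reduction. The function name `pair_sum` and its size parameters are hypothetical, and the kernel sums a charge-weighted squared distance instead of 1/sqrt(...) purely so that the example does not need the math library; that substitution is an assumption of the sketch, not the original computation.

```c
/* Hypothetical cut-down kernel: sums qr[j]*ql[i]*|rock_j - moon_i|^2 over all pairs.
   The squared distance stands in for 1/sqrt(...) only to avoid linking libm. */
double pair_sum(int nr, int ns,
                float rocks[][3], float moon[][3],
                float qr[], float ql[])
{
    double S2 = 0.0;

    /* j is the loop variable of the parallel loop, so it is private automatically.
       Each thread accumulates its own S2; the copies are added at the end. */
    #pragma omp parallel for reduction(+:S2)
    for (int j = 0; j < nr; j++) {
        for (int i = 0; i < ns; i++) {
            /* Declared inside the loop body: one independent copy per thread. */
            float difx = rocks[j][0] - moon[i][0];
            float dify = rocks[j][1] - moon[i][1];
            float difz = rocks[j][2] - moon[i][2];
            float dist2 = difx * difx + dify * dify + difz * difz;
            S2 += (double)ql[i] * qr[j] * dist2;
        }
    }
    return S2;
}
```

Built without -fopenmp the pragma is ignored and the function simply runs serially, so the same source can be used to compare the two builds.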
    
只有影子陪我不离不弃 2024-10-21 00:12:50

I know this is an old post, but I think the problem is the order of gcc's arguments: -fopenmp should come at the end of the compile line.

喜爱皱眉﹌ 2024-10-21 00:12:50

First, depending on the implementation, reduction might be better than using atomic. I would try both and time them to see for sure.
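
One way to try both and time them is sketched below, using the simple a*b sum from the question. The names `sum_atomic`, `sum_reduction`, `now_seconds` and `time_both` are made up for this example. The timer uses omp_get_wtime() when the file is built with OpenMP and falls back to clock() otherwise, so the file also compiles without -fopenmp.

```c
#include <stdio.h>
#include <time.h>
#ifdef _OPENMP
#include <omp.h>
#endif

/* Wall-clock seconds with OpenMP; CPU seconds from clock() without it. */
static double now_seconds(void)
{
#ifdef _OPENMP
    return omp_get_wtime();
#else
    return (double)clock() / CLOCKS_PER_SEC;
#endif
}

/* Every update of the shared S2 is protected by an atomic. */
double sum_atomic(int n)
{
    double S2 = 0.0;
    #pragma omp parallel for shared(S2)
    for (int a = 0; a < n; a++) {
        for (int b = 0; b < n; b++) {
            double myterm = (double)a * b;
            #pragma omp atomic
            S2 += myterm;
        }
    }
    return S2;
}

/* Each thread sums into a private copy of S2; copies are combined at the end. */
double sum_reduction(int n)
{
    double S2 = 0.0;
    #pragma omp parallel for reduction(+:S2)
    for (int a = 0; a < n; a++) {
        for (int b = 0; b < n; b++) {
            S2 += (double)a * b;
        }
    }
    return S2;
}

/* Runs both variants once and prints the result and elapsed time of each. */
void time_both(int n)
{
    double t0 = now_seconds();
    double s_atomic = sum_atomic(n);
    double t1 = now_seconds();
    double s_red = sum_reduction(n);
    double t2 = now_seconds();
    printf("atomic:    %.0f in %.6f s\n", s_atomic, t1 - t0);
    printf("reduction: %.0f in %.6f s\n", s_red, t2 - t1);
}
```

With n as small as 128 the timings are dominated by thread start-up, so a larger n or several repetitions probably give a more meaningful comparison.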

Second, if you leave off the atomic, you may or may not see the problem (wrong result) associated with the race. It is all about timing, which from one run to the next can be quite different. I have seen cases where the result was wrong only once in 150,000 runs and others where it has been wrong all the time.
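
Because the wrong result may show up only rarely, a single run proves little. A rough way to look for it is to call the function many times and count how often the result differs from the known answer, as in the sketch below. The helper `count_mismatches` and the two variants are hypothetical names; `sum_racy` is deliberately incorrect and is included only to have something to measure. Exact comparison is safe here only because every term is a small integer that a double represents exactly.

```c
/* Counts how many of `runs` calls to f() return a value other than `expected`. */
int count_mismatches(double (*f)(void), double expected, int runs)
{
    int bad = 0;
    for (int r = 0; r < runs; r++) {
        if (f() != expected) {
            bad++;
        }
    }
    return bad;
}

/* Correct variant: the update of the shared S2 is atomic. */
double sum_protected(void)
{
    double S2 = 0.0;
    #pragma omp parallel for shared(S2)
    for (int a = 0; a < 128; a++) {
        for (int b = 0; b < 128; b++) {
            double myterm = (double)a * b;
            #pragma omp atomic
            S2 += myterm;
        }
    }
    return S2;
}

/* Deliberately racy variant: threads update S2 without any protection. */
double sum_racy(void)
{
    double S2 = 0.0;
    #pragma omp parallel for shared(S2)
    for (int a = 0; a < 128; a++) {
        for (int b = 0; b < 128; b++) {
            S2 += (double)a * b;
        }
    }
    return S2;
}
```

The expected value is (0 + 1 + ... + 127)^2 = 8128^2 = 66064384. Calling count_mismatches(sum_racy, 66064384.0, 100000) in an OpenMP build with several threads may report anything from zero to many mismatches, which is the point being made above.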

Third, the idea behind pragmas is that a compiler that does not recognize one can ignore it, so the user does not need to know about pragmas that have no effect. Besides that, the philosophy in Unix (and its derivatives) is to stay quiet unless there is a problem. That said, many implementations have some sort of flag that gives the user more information when they cannot tell what is happening. You can try -Wall with gcc; at the least it should flag the oh_my_god pragma as being ignored.
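
A complementary check is to ask the program itself whether OpenMP was enabled at compile time. The _OPENMP macro is defined only when the compiler is processing OpenMP directives, so the sketch below (the function names `openmp_version` and `report_openmp` are made up for this example) reports either its value or the fact that every #pragma omp line was skipped. On the warning side, GCC's -Wall turns on -Wunknown-pragmas, and on reasonably recent GCC versions -Werror=unknown-pragmas should promote that warning to an error for stricter checking.

```c
#include <stdio.h>

/* Returns the value of _OPENMP (a yyyymm date identifying the supported
   OpenMP version), or 0 when the file was compiled without OpenMP. */
long openmp_version(void)
{
#ifdef _OPENMP
    return _OPENMP;
#else
    return 0L;
#endif
}

/* Prints a one-line summary of whether OpenMP pragmas were honoured. */
void report_openmp(void)
{
    long v = openmp_version();
    if (v != 0) {
        printf("OpenMP enabled, _OPENMP = %ld\n", v);
    } else {
        printf("OpenMP disabled: #pragma omp lines were ignored\n");
    }
}
```

Calling report_openmp() at the start of main makes it obvious when a build has silently dropped -fopenmp for the file in question.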

一个人的旅程 2024-10-21 00:12:50

You have

#pragma omp parallel for shared(S2)
  for(int a=0; a<128; a++){
   ....

So the only parallelization applies to the for loop.

If you want to use atomic or a reduction, you have to write:

double mysumallatomic() {
  double S2 = 0.;
  #pragma omp parallel shared(S2)
  {
    // shared is a clause of parallel; the worksharing for does not accept it
    #pragma omp for
    for(int a=0; a<128; a++){
      for(int b=0; b<128;b++){
        double myterm = (double)a*b;
        #pragma omp atomic
        S2 += myterm;
      } // end of second for
    } // end of 1st for
  } // end of parallel code
  return S2;
} // end of function

Otherwise everything after the # will be treated as a comment.
