No speedup from OpenMP SIMD, and different results between ICC and GCC
I am new to OpenMP and am trying to use the OpenMP SIMD directive (#pragma omp simd) to speed up my program, but the result is far from what I expected.
/* program: simd.c */
#include <stdio.h>
#include <stdlib.h>
#include <omp.h>
#include <math.h>
#define M 10000
int main() {
    float a[M], b[M];
    double t1, t2;

    /* Version 1: inner loop annotated with the OpenMP simd directive */
    t1 = omp_get_wtime();
    for (int j = 0; j < M; j++)
        #pragma omp simd
        for (int i = 0; i < M; i++) {
            a[i] = log(pow(2.71828, (pow(sin(pow(1.1, 1.1)), 1.1) + 1.0)) + j);
            b[i] = cos(log(pow(2.71828, (pow(sin(pow(1.1, 1.1)), 1.1) + 1.0)) + j));
        }
    t2 = omp_get_wtime();
    printf("simd time = %lfs\n", t2 - t1);
    printf("a[10] = %f ,b[10] = %f\n\n", a[10], b[10]);

    /* Version 2: the same loops without the directive */
    t1 = omp_get_wtime();
    for (int j = 0; j < M; j++)
        for (int i = 0; i < M; i++) {
            a[i] = log(pow(2.71828, (pow(sin(pow(1.1, 1.1)), 1.1) + 1.0)) + j);
            b[i] = cos(log(pow(2.71828, (pow(sin(pow(1.1, 1.1)), 1.1) + 1.0)) + j));
        }
    t2 = omp_get_wtime();
    printf("time = %lfs\n", t2 - t1);
    printf("a[10] = %f ,b[10] = %f\n\n", a[10], b[10]);
    return 0;
}
I run the code under WSL2 with GCC 7.5.0 (Ubuntu 7.5.0-3ubuntu1~18.04).
With GCC, the timings with and without the simd directive are almost the same.
Another thing puzzles me: why does the same program run about 300 times faster when compiled with ICC instead of GCC?
ivan@LAPTOP-JQJBOOBT:~$ icc simd.c -qopenmp -o simd
ivan@LAPTOP-JQJBOOBT:~$ ./simd
simd time = 0.026405s
a[10] = 9.210899 ,b[10] = -0.977215
time = 0.026401s
a[10] = 9.210899 ,b[10] = -0.977215
I hope you can help me figure out why, or give me some advice. I would be grateful!