Floating point exception in OpenMP parallel for
One of the files in my project has a for loop that I tried to parallelize using an OpenMP parallel for. When I ran it, I got a floating point exception. I couldn't reproduce the error in a separate test program; however, I could reproduce it in the same file using a dummy parallel region (the original for loop has some detailed array computations, hence the dummy code):
#pragma omp parallel for
for (i = 0; i < 8; i++)
{
    puts("hello world");
}
I still got the same error. Here's the gdb output:
Program received signal SIGFPE, Arithmetic exception.
[Switching to Thread 0x7ffff4c44710 (LWP 18912)]
0x0000000000402fd4 in allocate_2D_matrix.omp_fn.0 (.omp_data_i=0x0) at main.c:119
119 #pragma omp parallel for
By trial and error, I solved the problem by adding a schedule clause to the OpenMP construct:
#pragma omp parallel for schedule(dynamic)
for (i = 0; i < 8; i++)
{
    puts("hello world");
}
and it worked just fine. I could reproduce this entire behaviour on two different systems (gcc 4.4.5 on 64-bit Linux Mint and gcc 4.5.0 on 64-bit openSUSE).
Does anyone have any idea what might have caused it? I strongly suspect it is related to my program, since I couldn't reproduce the error separately, but I don't know where to look. The problem is solved, of course, but I am curious. If need be, I can post the entire original function where I see this behaviour.
Comments (2)
Most likely puts isn't thread-safe. Put it in a critical section and see what happens.
I had the same issue. It seems to happen when using unsigned ints as loop iteration variables. Here is an example that has the problem and the fix: