Data corruption when threading with the Vector Statistical Library (Math Kernel Library)

Posted on 2024-07-16 03:49:57

I've just parallelized a Fortran routine that simulates individuals' behavior, and I've run into problems generating random numbers with the Vector Statistical Library (VSL, part of the Math Kernel Library). The program is structured as follows:

program example
  ...
  !$omp parallel do num_threads(proc) default(none) private(...) shared(...)
  do i=1,n
    call firstroutine(...)
  enddo
  !$omp end parallel do
  ...
end program example

subroutine firstroutine
  ...
  call secondroutine(...)
  ...
end subroutine

subroutine secondroutine
  ...
  ! VSL calls (random number generation) happen here
  ...
end subroutine

I compile with the Intel Fortran Compiler, using a makefile that looks like this:

f90comp = ifort
libdir = /home/
mklpath = /opt/intel/mkl/10.0.5.025/lib/32/
mklinclude = /opt/intel/mkl/10.0.5.025/include/
# Link step: all three object files, then MKL and the threading libraries
exec: Example.o Firstroutine.o Secondroutine.o
	$(f90comp) -O3 -fpscomp logicals -openmp -o aaa Example.o Firstroutine.o Secondroutine.o -L$(mklpath) -lmkl_ia32 -lguide -lpthread
Example.o: $(libdir)Example.f90
	$(f90comp) -O3 -fpscomp logicals -openmp -c $(libdir)Example.f90
Firstroutine.o: $(libdir)Firstroutine.f90
	$(f90comp) -O3 -fpscomp logicals -openmp -c $(libdir)Firstroutine.f90
# Only this file makes VSL calls, so only it needs the MKL include path;
# library flags (-L, -l) have no effect on a compile-only (-c) step
Secondroutine.o: $(libdir)Secondroutine.f90
	$(f90comp) -O3 -fpscomp logicals -openmp -c -I$(mklinclude) $(libdir)Secondroutine.f90

Compilation works fine, and when I run the program and generate random variables with VSL, everything seems to work at first. However, from time to time (roughly once every 200-500 iterations) it produces wildly wrong numbers for a couple of iterations and then goes back to running normally. I have not found any pattern in when this corruption happens.

Any idea why this is happening?
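
One common cause of this kind of intermittent corruption is several threads drawing from the same VSL stream, or from a stream variable that is shared or left uninitialized inside the parallel region. Whether that applies here cannot be determined from the skeleton above, because the actual VSL calls are elided. As a minimal sketch of the usual remedy, the example below gives each OpenMP thread its own stream. The program name, `n`, `seed`, and the choice of the MT2203 generator family are assumptions made for illustration. The header name `mkl_vsl.f90` and the constant `VSL_RNG_METHOD_UNIFORM_STD` follow recent MKL releases; older releases such as 10.0 may use different names.

```fortran
include 'mkl_vsl.f90'   ! assumed header name; defines MKL_VSL_TYPE and MKL_VSL

program per_thread_streams
  use omp_lib
  use mkl_vsl_type
  use mkl_vsl
  implicit none

  integer, parameter :: n = 1000       ! hypothetical iteration count
  integer, parameter :: seed = 12345   ! hypothetical seed
  type(vsl_stream_state) :: stream     ! made private below: one stream per thread
  real(kind=8) :: r(1), total
  integer :: i, tid, status

  total = 0.0d0

  !$omp parallel private(stream, r, i, tid, status) reduction(+:total)
  tid = omp_get_thread_num()

  ! MT2203 is a family of independent generators; thread tid uses member tid.
  ! Return codes are stored but not checked here, to keep the sketch short.
  status = vslnewstream(stream, VSL_BRNG_MT2203 + tid, seed)

  !$omp do
  do i = 1, n
     ! Each thread draws only from its own private stream
     status = vdrnguniform(VSL_RNG_METHOD_UNIFORM_STD, stream, 1, r, 0.0d0, 1.0d0)
     total = total + r(1)
  end do
  !$omp end do

  status = vsldeletestream(stream)
  !$omp end parallel

  print *, 'mean of draws:', total / n
end program per_thread_streams
```

In the program from the question, the equivalent change would be to create the stream once per thread before the loop and pass it down through `firstroutine` to `secondroutine`, rather than creating or sharing it inside `secondroutine`. Any work arrays used around the VSL calls would need to be private to each thread as well.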
