Is ARPACK thread-safe?
Is it safe to use the ARPACK eigensolver from different threads at the same time from a program written in C? Or, if ARPACK itself is not thread-safe, is there an API-compatible thread-safe implementation out there? A quick Google search didn't turn up anything useful, but given the fact that ARPACK is used heavily in large scientific calculations, I'd find it highly surprising to be the first one who needs a thread-safe sparse eigensolver.
I'm not too familiar with Fortran, so I translated the ARPACK source code to C using f2c, and it seems that there are quite a few static variables. Basically, all the local variables in the translated routines seem to be static, implying that the library itself is not thread-safe.
Comments (4)
Fortran 77 does not support recursion, and hence a standard conforming compiler can allocate all variables in the data section of the program; in principle, neither a stack nor a heap is needed [1].
It might be that this is what f2c is doing, and if so, it might be the f2c step that makes the program non-thread-safe, rather than the program itself. Of course, as others have mentioned, check for COMMON blocks as well. EDIT: Also, check for explicit SAVE directives. SAVE means that the value of the variable should be retained between subsequent invocations of the procedure, similar to static in C. Now, allocating all procedure-local data in the data section makes all variables implicitly SAVE, and unfortunately, there is a lot of old code that assumes this even though it's not guaranteed by the Fortran standard. Such code, obviously, is not thread-safe. As for ARPACK specifically, I can't promise anything, but ARPACK is generally well regarded and widely used, so I'd be surprised if it suffered from these kinds of dusty-deck problems.
Most modern Fortran compilers do use stack allocation. You might have better luck compiling ARPACK with, say, gfortran and the -frecursive option.
EDIT:
[1] Not because it's more efficient, but because Fortran was originally designed before stacks and heaps were invented, and for some reason the standards committee wanted to retain the option to implement Fortran on hardware with neither stack nor heap support all the way up to Fortran 90. Actually, I'd guess that on today's heavily cache-dependent hardware, stacks are more efficient than accessing procedure-local data that is spread all over the data section.
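To make the SAVE/static analogy above concrete, here is a minimal C sketch. The function names are hypothetical and this is not actual f2c output; it only illustrates that a static local keeps its value between calls (and is therefore shared by every thread that calls the routine), while an automatic local is fresh on each call.

```c
#include <assert.h>

/* Hypothetical routine: the local behaves like a SAVEd Fortran variable.
 * There is one copy for the whole program, so its value survives between
 * calls and would be shared by all threads calling this function. */
int calls_saved(void)
{
    static int count = 0;
    return ++count;
}

/* Same routine with automatic (stack) storage: every call, and hence
 * every thread, gets its own private copy of the local. */
int calls_auto(void)
{
    int count = 0;
    return ++count;
}

/* Small self-check of the difference described above. */
void demo_storage(void)
{
    assert(calls_saved() == 1);
    assert(calls_saved() == 2);   /* state was retained */
    assert(calls_auto() == 1);
    assert(calls_auto() == 1);    /* no state retained */
}
```

Calling demo_storage() from main exercises both variants. Code that only works because its locals behave like calls_saved is the kind that breaks once locals move to the stack or several threads call it at once.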
I have converted ARPACK to C using f2c. Whenever you use f2c and you care about thread-safety you must use the -a switch. This makes local variables have automatic storage, i.e. be stack-based locals rather than statics, which is the default.
Even so, ARPACK itself is decidedly not thread-safe. It uses a lot of common blocks (i.e. global variables) to preserve state between different calls to its functions. If memory serves, it uses a reverse communication interface, which tends to lead developers to using global variables. And of course ARPACK was probably written long before multi-threading was common.
I ended up reworking the converted C code to systematically remove all the global variables. I created a handful of C structs and gradually moved the global variables into these structs. Finally, I passed pointers to these structs to each function that needed access to those variables. Although I could have just converted each global into a parameter wherever it was needed, it was much cleaner to keep them all together, contained in structs.
Essentially the idea is to convert global variables into local variables.
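A minimal sketch of that refactoring pattern, assuming POSIX threads. None of the names below come from ARPACK; solver_ctx, solver_step and the rest are hypothetical stand-ins for state that would otherwise live in a COMMON block and for a routine that is called repeatedly in reverse-communication style.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical replacement for a COMMON block: all state that must
 * survive between calls lives in a struct owned by the caller. */
typedef struct {
    int    iter;   /* iteration counter kept between calls */
    double sum;    /* running value kept between calls */
} solver_ctx;

void solver_init(solver_ctx *ctx)
{
    ctx->iter = 0;
    ctx->sum  = 0.0;
}

/* One step of a repeatedly called routine. It touches only *ctx,
 * never a global. Returns nonzero while it wants to be called again. */
int solver_step(solver_ctx *ctx, double x, int max_iter)
{
    ctx->sum  += x;
    ctx->iter += 1;
    return ctx->iter < max_iter;
}

typedef struct { double x; int n; double result; } job;

static void *worker(void *arg)
{
    job *j = arg;
    solver_ctx ctx;                 /* private to this thread */
    assert(j != NULL);
    solver_init(&ctx);
    while (solver_step(&ctx, j->x, j->n)) { /* keep stepping */ }
    j->result = ctx.sum;
    return NULL;
}

/* Runs two independent jobs concurrently; returns 0 on success. */
int run_two_jobs(job *a, job *b)
{
    pthread_t ta, tb;
    if (pthread_create(&ta, NULL, worker, a) != 0)
        return -1;
    if (pthread_create(&tb, NULL, worker, b) != 0) {
        pthread_join(ta, NULL);
        return -1;
    }
    pthread_join(ta, NULL);
    pthread_join(tb, NULL);
    return 0;
}
```

Because each thread owns its own solver_ctx, the two runs cannot interfere with each other. With iter and sum as file-scope globals instead, the same two threads would race on them.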
ARPACK uses BLAS (and LAPACK), right? Then those libraries need to be thread-safe too.
I believe your idea of checking with f2c might not be a bulletproof way of telling whether the Fortran code is thread-safe; I would guess it also depends on the Fortran compiler and libraries.
I don't know what strategy f2c uses in translating Fortran. Since ARPACK is written in FORTRAN 77, the first thing to do is check for the presence of COMMON blocks. These are global variables, and if they are used, the code is most likely not thread-safe. The ARPACK webpage, http://www.caam.rice.edu/software/ARPACK/, says that there is a parallel version; it seems likely that that version is thread-safe.