Building a Python module around a C++ MPI library: import error
I have a Python module which wraps a C++ library. The library uses MPI and is compiled with mpicxx. Everything works great on some machines, but on others I get this:
ImportError: ./_pyCombBLAS.so: undefined symbol: _ZN3MPI3Win4FreeEv
So there's an undefined symbol from the MPI library. As far as I can tell mpicxx should link everything in, yet it doesn't. Any ideas why?
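To see which function the loader is complaining about, the mangled name can be decoded. The following is only a minimal sketch of Itanium C++ ABI demangling: `demangle_simple` is a hypothetical helper that handles nothing beyond a nested name with a `void` parameter list, which happens to be enough for the symbol in this error. A real tool such as `c++filt` covers the full grammar.

```python
def demangle_simple(mangled):
    """Decode names of the form _ZN<len><id>...Ev; anything else raises ValueError."""
    if not mangled.startswith("_ZN"):
        raise ValueError("only simple nested names (_ZN...E) are handled")
    pos = 3
    parts = []
    # Each component is a decimal length followed by that many characters.
    while pos < len(mangled) and mangled[pos].isdigit():
        end = pos
        while end < len(mangled) and mangled[end].isdigit():
            end += 1
        length = int(mangled[pos:end])
        parts.append(mangled[end:end + length])
        pos = end + length
    if pos >= len(mangled) or mangled[pos] != "E":
        raise ValueError("expected 'E' closing the nested name")
    if mangled[pos + 1:] != "v":
        raise ValueError("only a void parameter list is handled")
    return "::".join(parts) + "()"


print(demangle_simple("_ZN3MPI3Win4FreeEv"))  # MPI::Win::Free()
```

So the missing symbol is `MPI::Win::Free()`, part of the MPI C++ bindings rather than the C API.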
Comments (3)
It turns out that the fault was that the wrong libraries were being loaded.
As you know, a cluster is likely to have several versions of MPI installed, and sometimes the same version is built with several compilers. These builds are all likely to use the same library filenames. In my case, even though I compiled with, say, MPICH GNU, the default library path pointed to the OpenMPI PGI libraries. I didn't realize this; I thought compiling with MPICH GNU would mean the MPICH GNU libraries would be found at runtime.
Of course I couldn't actually use PGI-compiled OpenMPI because Python was compiled with GCC and PGI doesn't emit binaries fully compatible with GCC.
The solution is to set the LD_LIBRARY_PATH environment variable to match the MPI distribution you used to compile your code.
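A quick way to spot this kind of mix-up is to list which directories on the library search path contain MPI libraries, in search order. This is a rough sketch: `mpi_libs_on_path` is a hypothetical helper, and the `libmpi*.so*` pattern is an assumption about how the MPI libraries are named on your system.

```python
import glob
import os


def mpi_libs_on_path(search_path):
    """Return (directory, [library names]) for each directory holding libmpi*.so*."""
    hits = []
    for directory in search_path.split(os.pathsep):
        if not directory:
            continue  # skip empty entries such as a trailing ':'
        found = sorted(glob.glob(os.path.join(directory, "libmpi*.so*")))
        if found:
            hits.append((directory, [os.path.basename(f) for f in found]))
    return hits


if __name__ == "__main__":
    # Directories are printed in search order, so the first hit wins at runtime.
    for directory, names in mpi_libs_on_path(os.environ.get("LD_LIBRARY_PATH", "")):
        print(directory, names)
```

If the first directory printed belongs to a different MPI distribution than the one you compiled with, that is the one the dynamic loader will try first.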
It's a shared library problem. Try running ldd on the extension module on both the system where it works and the system where it fails.
This should show you all the libraries your extension depends on so you can make sure they're available.
A simple way to work around it may be to statically link the dependencies into your extension.
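The ldd comparison can be scripted so the MPI-related entries are easy to diff between machines. This is a sketch under assumptions: `parse_ldd` and `mpi_dependencies` are hypothetical helpers, the parser expects the usual `name => path (address)` layout that glibc's ldd prints, and the paths in the sample text are made up for illustration.

```python
import subprocess


def parse_ldd(output):
    """Map each library name in ldd output to its resolved path, or None if not found."""
    deps = {}
    for line in output.splitlines():
        line = line.strip()
        if "=>" not in line:
            continue  # entries such as linux-vdso have no '=>' part
        name, _, target = line.partition("=>")
        target = target.strip()
        if target.startswith("not found"):
            deps[name.strip()] = None
        else:
            # Drop the trailing load address, e.g. " (0x00007f...)".
            deps[name.strip()] = target.split(" (")[0].strip()
    return deps


def mpi_dependencies(module_path):
    """Run ldd on the extension module and keep only MPI-looking entries."""
    result = subprocess.run(["ldd", module_path],
                            capture_output=True, text=True, check=True)
    return {n: p for n, p in parse_ldd(result.stdout).items() if "mpi" in n.lower()}


# Hypothetical sample output, for illustration only.
sample = """\
\tlinux-vdso.so.1 (0x00007ffd0a5f2000)
\tlibmpi.so.0 => /opt/openmpi-pgi/lib/libmpi.so.0 (0x00007f1c2a000000)
\tlibmpi_cxx.so.0 => not found
"""
print(parse_ldd(sample))
```

Calling `mpi_dependencies("./_pyCombBLAS.so")` on the working machine and on the failing one should make any difference in the resolved MPI libraries obvious.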
The symbol _ZN3MPI3Win4FreeEv is defined in libmpi_cxx.so, so one has to link with -lmpi_cxx instead of -lmpi.
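Whether a given shared library provides the symbol can be checked from Python with ctypes. This is a small sketch: `defines_symbol` is a hypothetical helper, and the library name in the usage note assumes an installation that ships the C++ bindings as libmpi_cxx.so, as described above. Note that the lookup also searches the library's own dependencies, so a positive result means "this library or something it pulls in".

```python
import ctypes


def defines_symbol(library, symbol):
    """Return True if `symbol` resolves through `library` (or its dependencies).

    Passing None for `library` checks the symbols already visible in the
    current process instead of loading a new library.
    """
    try:
        handle = ctypes.CDLL(library)
    except OSError:
        return False  # library could not be loaded at all
    # ctypes raises AttributeError for unknown symbols, which hasattr catches.
    return hasattr(handle, symbol)
```

For example, `defines_symbol("libmpi_cxx.so", "_ZN3MPI3Win4FreeEv")` and `defines_symbol("libmpi.so", "_ZN3MPI3Win4FreeEv")` can be compared to confirm which of the two has to be on the link line.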