How do I fix package dependencies when using a different cudatoolkit than the one installed on my NVIDIA cluster?

Posted on 2025-02-07 07:48:44

I am using a package that requires tensorflow-gpu==2.0.0 and CUDA 10.0 with cuDNN 7.6.0.

I am running this code on an NVIDIA GPU cluster, and when I run nvidia-smi its output still shows CUDA 11, which I guess is the version installed on the server itself.

I was told that I can basically "override" this version by installing the cudatoolkit in the version that I need. I did that and installed cudatoolkit==10.0.

Unfortunately, I now run into a problem when trying to run an LSTM-based model with tensorflow-gpu. I get the following:

2022-06-14 17:02:26.988359: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libnvinfer.so.6'; dlerror: libnvinfer.so.6: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/cuda-11.2/lib64
2022-06-14 17:02:26.989175: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libnvinfer_plugin.so.6'; dlerror: libnvinfer_plugin.so.6: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/cuda-11.2/lib64
2022-06-14 17:02:26.989208: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:30] Cannot dlopen some TensorRT libraries. If you would like to use Nvidia GPU with TensorRT, please make sure the missing libraries mentioned above are installed properly.
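To make the warnings easier to read, here is a minimal Python sketch that pulls the missing library names and the search path out of log lines in this format. The helper name `missing_libraries` and the regular expression are my own illustration, not part of TensorFlow.

```python
import re

# Matches the dso_loader warning format shown in the log above.
PATTERN = re.compile(
    r"Could not load dynamic library '(?P<lib>[^']+)'.*?LD_LIBRARY_PATH: (?P<path>\S+)"
)


def missing_libraries(log_text):
    """Return (library, LD_LIBRARY_PATH) pairs for each failed load in the log."""
    return [(m.group("lib"), m.group("path")) for m in PATTERN.finditer(log_text)]


# One warning line copied from the log above (timestamp omitted).
sample = (
    "W tensorflow/stream_executor/platform/default/dso_loader.cc:55] "
    "Could not load dynamic library 'libnvinfer.so.6'; dlerror: libnvinfer.so.6: "
    "cannot open shared object file: No such file or directory; "
    "LD_LIBRARY_PATH: /usr/local/cuda-11.2/lib64\n"
)

print(missing_libraries(sample))
```

Run on the two dso_loader lines above, this would list libnvinfer.so.6 and libnvinfer_plugin.so.6, both searched for under /usr/local/cuda-11.2/lib64.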

The LD_LIBRARY_PATH in these warnings still points to CUDA 11.2. Is this causing the problem? How can I resolve it?
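One way to narrow this down is to check which libraries the dynamic loader can actually open from inside the environment. The following is a rough sketch only: the library file names are assumptions (libcudart.so.10.0 and libcudnn.so.7 are the names I would expect for CUDA 10.0 and cuDNN 7.6, and the two libnvinfer names are taken from the warnings), and the helpers `can_load` and `report` are hypothetical.

```python
import ctypes
import os

# Assumed file names: CUDA 10.0 runtime, cuDNN 7.x, and the two TensorRT
# libraries named in the warnings.
LIBS = [
    "libcudart.so.10.0",
    "libcudnn.so.7",
    "libnvinfer.so.6",
    "libnvinfer_plugin.so.6",
]


def can_load(name):
    """Return True if the dynamic loader can open the shared library."""
    try:
        ctypes.CDLL(name)
        return True
    except OSError:
        return False


def report(libs=LIBS):
    """Map each library name to whether it could be loaded."""
    return {name: can_load(name) for name in libs}


if __name__ == "__main__":
    print("LD_LIBRARY_PATH =", os.environ.get("LD_LIBRARY_PATH", "(unset)"))
    for name, ok in report().items():
        print(f"{name}: {'found' if ok else 'NOT found'}")
```

If the CUDA 10.0 and cuDNN libraries load but only the libnvinfer ones do not, that would suggest the cudatoolkit install is being picked up and only TensorRT is missing. This check only approximates how TensorFlow searches for libraries, so treat the result as a hint.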

