How do I fix package dependencies when I need a different cudatoolkit than the one on my NVIDIA cluster?

Posted 2025-02-07 07:48:44 · 1328 characters · 3 views · 0 comments

I am using a package that requires tensorflow-gpu==2.0.0 and CUDA 10.0.0 with cuDNN 7.6.0.

I am running this code on an NVIDIA GPU cluster. When I run nvidia-smi, it still shows CUDA 11, which I guess is the version installed on the actual server.

I was told that I can basically 'override' this version by installing the cudatoolkit in the version that I need. I did that and installed cudatoolkit==10.0.

Unfortunately, I am now running into a problem when trying to run an LSTM-based model with tensorflow-gpu. I get the following warnings:

2022-06-14 17:02:26.988359: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libnvinfer.so.6'; dlerror: libnvinfer.so.6: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/cuda-11.2/lib64
2022-06-14 17:02:26.989175: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libnvinfer_plugin.so.6'; dlerror: libnvinfer_plugin.so.6: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/cuda-11.2/lib64
2022-06-14 17:02:26.989208: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:30] Cannot dlopen some TensorRT libraries. If you would like to use Nvidia GPU with TensorRT, please make sure the missing libraries mentioned above are installed properly.

The LD_LIBRARY_PATH in these messages still points to CUDA 11. Is this causing the problem? How can I resolve it?

Comments (1)

森林散布 2025-02-14 07:48:44

As you mentioned in the comments, you need to use TensorFlow 2.1, so you need to install cuDNN 7.6 and CUDA 10.1 specifically.
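
To see whether the CUDA 10.1 and cuDNN 7.6 libraries are actually visible, it can help to look for them in the directories listed in LD_LIBRARY_PATH, which in the warnings from the question contains only /usr/local/cuda-11.2/lib64. The following is a minimal sketch that uses only the Python standard library. The helper name find_shared_library is hypothetical, and the file names libcudart.so.10.1 and libcudnn.so.7 are assumptions for illustration; libnvinfer.so.6 is the name taken from the warning itself. It inspects LD_LIBRARY_PATH only, not the other locations the dynamic loader may search.

```python
import os
from pathlib import Path


def find_shared_library(lib_name, search_path=None):
    """Return the directories in search_path that contain a file named lib_name.

    search_path uses the same separator-delimited format as LD_LIBRARY_PATH.
    If it is None, the current LD_LIBRARY_PATH environment variable is used.
    """
    if search_path is None:
        search_path = os.environ.get("LD_LIBRARY_PATH", "")
    hits = []
    for entry in search_path.split(os.pathsep):
        if not entry:
            continue  # skip empty entries, e.g. from a trailing separator
        if (Path(entry) / lib_name).exists():
            hits.append(entry)
    return hits


if __name__ == "__main__":
    # Assumed library file names for a CUDA 10.1 / cuDNN 7.6 setup;
    # adjust them to whatever names your own log messages mention.
    for lib in ("libcudart.so.10.1", "libcudnn.so.7", "libnvinfer.so.6"):
        dirs = find_shared_library(lib)
        print(lib, "->", dirs if dirs else "not found on LD_LIBRARY_PATH")
```

If a library is reported as not found, one option is to add the lib directory of the environment where the matching cudatoolkit was installed to LD_LIBRARY_PATH and run the check again.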

Please refer to the tested build configurations to see which CUDA and cuDNN versions are compatible with each Python and TensorFlow version.
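
As a rough illustration of how such a compatibility table can be used, here is a small sketch that encodes two rows as a lookup. The TensorFlow 2.1 row matches the versions stated above; the TensorFlow 2.0 row is recalled from TensorFlow's published table and should be treated as an assumption to verify against the official page. The names TESTED_GPU_BUILDS and required_toolkit are hypothetical.

```python
# Excerpt of the tested build configurations for Linux GPU builds.
# ASSUMPTION: values are illustrative and should be checked against
# the official TensorFlow table before relying on them.
TESTED_GPU_BUILDS = {
    "2.0": {"cuda": "10.0", "cudnn": "7.4"},
    "2.1": {"cuda": "10.1", "cudnn": "7.6"},
}


def required_toolkit(tf_version):
    """Return the CUDA and cuDNN versions listed for a TensorFlow version.

    Only the major.minor part of tf_version is used, so "2.1.0" and "2.1"
    map to the same row. Raises KeyError for versions not in the excerpt.
    """
    major_minor = ".".join(tf_version.split(".")[:2])
    if major_minor not in TESTED_GPU_BUILDS:
        raise KeyError(f"no entry for TensorFlow {tf_version} in this excerpt")
    return TESTED_GPU_BUILDS[major_minor]


if __name__ == "__main__":
    need = required_toolkit("2.1.0")
    print(f"TensorFlow 2.1.0 -> CUDA {need['cuda']}, cuDNN {need['cudnn']}")
```

The point of the lookup is that the toolkit version follows from the TensorFlow version actually installed, so it is worth confirming which TensorFlow version is in the environment before choosing a cudatoolkit.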

Please check this link for more details on GPU setup.
