GPU available in TensorFlow but not in PyTorch
I'm currently working on a server and I would like to be able to use the GPUs for PyTorch network training. I cannot detect the GPUs with torch, but if I use TensorFlow I can detect both of the GPUs I am supposed to have. I suppose it's a version mismatch between PyTorch/TensorFlow and the CUDA version underneath.
However, after trying different versions of PyTorch, I am still not able to use them...
I am attaching the specifications of the GPUs and the current versions of TensorFlow and PyTorch I am using. Does anyone have a hint? It would be very helpful.
| NVIDIA-SMI 4--.--.-- Driver Version: 465.19.01 CUDA Version: 11.3 |
|-------------------------------+----------------------+----------------------|
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|===============================+======================+======================|
| 0 NVIDIA GeForce ... On | 00000000:02:00.0 Off | N/A |
| 27% 39C P8 17W / 250W | 1MiB / 11176MiB | 0% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+
| 1 NVIDIA GeForce ... On | 00000000:81:00.0 Off | N/A |
| 28% 45C P8 11W / 250W | 1MiB / 11178MiB | 0% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=============================================================================|
| No running processes found |
+-----------------------------------------------------------------------------+
$ nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2020 NVIDIA Corporation
Built on Wed_Jul_22_19:09:09_PDT_2020
Cuda compilation tools, release 11.0, V11.0.221
Build cuda_11.0_bu.TC445_37.28845127_0
Torch version: 1.10.2
TensorFlow version: 2.6.2
CUDA toolkit: 11.3.1
>>> import tensorflow as tf
>>> import torch
>>> print('Number of GPUs: %d' % len(tf.config.list_physical_devices('GPU')))
Number of GPUs: 2
>>> torch.cuda.is_available()
False
I am so lost... Thank you in advance!
Run pip list and check whether the torch version you downloaded looks like this: torch 1.11.0+cu113
If there is no +cuXXX suffix, you probably downloaded torch without CUDA enabled.
I had the same problem, and the install command below worked for me (on Ubuntu).
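The version-string check above can be scripted; here is a minimal sketch (the helper name has_cuda_build is illustrative, not part of any API):

```python
def has_cuda_build(version: str) -> bool:
    # CUDA-enabled PyTorch wheels carry a "+cuXXX" local-version suffix,
    # e.g. "1.11.0+cu113"; CPU-only wheels show "+cpu" or no suffix at all.
    return "+cu" in version

print(has_cuda_build("1.11.0+cu113"))  # True  -> CUDA build
print(has_cuda_build("1.10.2"))        # False -> CPU-only build
print(has_cuda_build("1.11.0+cpu"))    # False -> CPU-only build
```

In a live session you would pass torch.__version__ to a check like this.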
I finally resolved this problem by specifying the CUDA version of PyTorch explicitly... The combination of those specific versions had been installing the CPU-only build.
After installing the correct one, I have been able to use the GPUs on the server without any problem.
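For completeness, a CUDA-specific install for the versions in the question might look like the following (a sketch: the exact version pins are assumptions chosen to match the CUDA 11.3 driver shown above):

```shell
# Request the cu113 wheels explicitly; the extra index serves the CUDA builds.
# Version pins are illustrative for the torch 1.10.2 / CUDA 11.3 setup above.
pip install torch==1.10.2+cu113 torchvision==0.11.3+cu113 \
    --extra-index-url https://download.pytorch.org/whl/cu113
```

After a CUDA-enabled install, torch.cuda.is_available() should return True.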