How to move PyTorch model to GPU on Apple M1 chips?

Posted on 2025-02-02 10:58:32


On 18th May 2022, PyTorch announced support for GPU-accelerated PyTorch training on Mac.

I used the following steps to set up PyTorch on my MacBook Air M1 (using Miniconda).

$ conda create -n torch-nightly python=3.8

$ conda activate torch-nightly

$ pip install --pre torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/nightly/cpu

I am trying to execute a script from Udacity's Deep Learning Course available here.

The script moves the models to GPU using the following code:

G.cuda()
D.cuda()

However, this will not work on M1 chips, since there is no CUDA.

What should we do to move both the models and the tensors to the M1 GPU and train entirely on it?

If relevant: G and D are the generator and discriminator of a GAN.

import torch
import torch.nn as nn
import torch.nn.functional as F

# conv() and deconv() are helper functions assumed to be defined elsewhere in the course script.

class Discriminator(nn.Module):

    def __init__(self, conv_dim=32):
        super(Discriminator, self).__init__()
        self.conv_dim = conv_dim
        # complete init function
        self.cv1 = conv(in_channels=3, out_channels=conv_dim, kernel_size=4, stride=2, padding=1, batch_norm=False)            # 32*32*3  -> 16*16*32
        self.cv2 = conv(in_channels=conv_dim, out_channels=conv_dim*2, kernel_size=4, stride=2, padding=1, batch_norm=True)    # 16*16*32 -> 8*8*64
        self.cv3 = conv(in_channels=conv_dim*2, out_channels=conv_dim*4, kernel_size=4, stride=2, padding=1, batch_norm=True)  # 8*8*64   -> 4*4*128
        self.fc1 = nn.Linear(in_features=4*4*conv_dim*4, out_features=1, bias=True)

    def forward(self, x):
        # complete forward function
        out = F.leaky_relu(self.cv1(x), 0.2)
        out = F.leaky_relu(self.cv2(out), 0.2)  # feed the previous layer's output, not x
        out = F.leaky_relu(self.cv3(out), 0.2)
        out = out.view(-1, 4*4*self.conv_dim*4)  # flatten to (batch, features)
        out = self.fc1(out)
        return out

D = Discriminator(conv_dim)

class Generator(nn.Module):    
    def __init__(self, z_size, conv_dim=32):
        super(Generator, self).__init__()
        self.conv_dim = conv_dim
        self.z_size = z_size
        # complete init function
        self.fc1 = nn.Linear(in_features = z_size, out_features = 4*4*conv_dim*4)
        self.dc1 = deconv(in_channels = conv_dim*4, out_channels = conv_dim*2, kernel_size=4, stride=2, padding=1, batch_norm=True)
        self.dc2 = deconv(in_channels = conv_dim*2, out_channels = conv_dim, kernel_size=4, stride=2, padding=1, batch_norm=True)
        self.dc3 = deconv(in_channels = conv_dim, out_channels = 3, kernel_size=4, stride=2, padding=1, batch_norm=False)

    def forward(self, x):
        # complete forward function
        x = self.fc1(x)
        x = x.view(-1, self.conv_dim*4, 4, 4)  # reshape to (batch, channels, 4, 4)
        x = F.relu(self.dc1(x))
        x = F.relu(self.dc2(x))
        x = torch.tanh(self.dc3(x))  # F.tanh is deprecated in favor of torch.tanh
        return x

G = Generator(z_size=z_size, conv_dim=conv_dim)
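
The shape comments in the Discriminator (32*32*3 -> 16*16*32 and so on) and the 4*4*conv_dim*4 input size of fc1 follow from the usual output-size formulas for strided convolutions. Below is a minimal sketch of that arithmetic, assuming conv and deconv wrap nn.Conv2d and nn.ConvTranspose2d with default dilation and no output padding. The helper names conv_out and deconv_out are made up for illustration and are not part of the course script.

```python
def conv_out(size: int, kernel: int = 4, stride: int = 2, padding: int = 1) -> int:
    """Spatial output size of a strided convolution (dilation 1)."""
    return (size + 2 * padding - kernel) // stride + 1


def deconv_out(size: int, kernel: int = 4, stride: int = 2, padding: int = 1) -> int:
    """Spatial output size of a transposed convolution (no output padding)."""
    return (size - 1) * stride - 2 * padding + kernel


# Discriminator: three stride-2 convolutions halve a 32x32 input each time.
d_sizes = [32]
for _ in range(3):
    d_sizes.append(conv_out(d_sizes[-1]))

# Generator: three stride-2 transposed convolutions double a 4x4 map each time.
g_sizes = [4]
for _ in range(3):
    g_sizes.append(deconv_out(g_sizes[-1]))

print(d_sizes)  # [32, 16, 8, 4]
print(g_sizes)  # [4, 8, 16, 32]
```

The final 4x4 map with conv_dim*4 channels is what gets flattened into fc1 in the Discriminator, and it is the shape the Generator reshapes to before its first transposed convolution.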

不一样的天空 2025-02-09 10:58:32

This is what I used:

if torch.backends.mps.is_available():
    mps_device = torch.device("mps")
    G.to(mps_device)
    D.to(mps_device)

Similarly, for every tensor that I wanted to move to the M1 GPU, I used:

tensor_ = tensor_.to(mps_device)
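
To keep the same script usable on CUDA machines, M1 Macs, and CPU-only setups, the device choice can be pulled into one place. The following is a minimal sketch rather than an official recipe: choose_device_name and best_device are hypothetical helper names, and the getattr guard is an assumption meant to cover older builds where torch.backends has no mps attribute.

```python
def choose_device_name(cuda_ok: bool, mps_ok: bool) -> str:
    """Return the preferred backend name: CUDA first, then MPS, then CPU."""
    if cuda_ok:
        return "cuda"
    if mps_ok:
        return "mps"
    return "cpu"


def best_device():
    """Query the installed PyTorch build and return a matching torch.device."""
    import torch  # imported lazily so choose_device_name stays dependency-free

    mps = getattr(torch.backends, "mps", None)  # None on builds without MPS support
    mps_ok = mps is not None and mps.is_available()
    return torch.device(choose_device_name(torch.cuda.is_available(), mps_ok))


# Intended use in the training script (G, D and tensor_ as in the snippets above):
#   device = best_device()
#   G.to(device)                  # nn.Module.to moves the parameters in place
#   D.to(device)
#   tensor_ = tensor_.to(device)  # Tensor.to returns a new tensor, so reassign it
```

Note the asymmetry in the last lines: calling .to on a module modifies the module, while calling .to on a tensor returns a copy on the target device and leaves the original where it was.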

Some operations are not yet implemented for MPS, so we may need to set an environment variable to fall back to the CPU for them.
One error I hit while running the script was:

# NotImplementedError: The operator 'aten::_slow_conv2d_forward' is not current implemented for the MPS device. If you want this op to be added in priority during the prototype phase of this feature, please comment on https://github.com/pytorch/pytorch/issues/77764. As a temporary fix, you can set the environment variable `PYTORCH_ENABLE_MPS_FALLBACK=1` to use the CPU as a fallback for this op. WARNING: this will be slower than running natively on MPS.

To solve it, I set the environment variable PYTORCH_ENABLE_MPS_FALLBACK=1:

conda env config vars set PYTORCH_ENABLE_MPS_FALLBACK=1
conda activate <test-env>
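
If changing the conda environment is not convenient, the same variable can be set from Python at the top of the script. This is only a sketch: as far as I can tell, the flag is read when PyTorch is loaded, so it is set here before any import of torch. The helper mps_fallback_enabled is a made-up name for illustration.

```python
import os

# Set the flag before PyTorch is imported anywhere in the process;
# setting it afterwards may have no effect.
os.environ["PYTORCH_ENABLE_MPS_FALLBACK"] = "1"


def mps_fallback_enabled() -> bool:
    """Report whether the CPU fallback flag is set for this process."""
    return os.environ.get("PYTORCH_ENABLE_MPS_FALLBACK") == "1"


print(mps_fallback_enabled())

# import torch  # import PyTorch only after the variable has been set
```

As the error message warns, operators that take the fallback path run on the CPU and will be slower than native MPS execution.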

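References:

https://pytorch.org/blog/introducing-accelerated-pytorch-training-on-mac/
https://sebastianraschka.com/blog/2022/pytorch-m1-gpu.html
https://docs.conda.io/projects/conda/en/latest/user-guide/tasks/manage-environments.html#setting-environment-variables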

诗笺 2025-02-09 10:58:32

I'd like to add to the answer above: make sure you are using the native arm64 build of Python (3.9.x) for M1 when installing the MPS build. If you're on conda, run:

import platform
print(platform.platform())

to check whether x86 or arm64 is being used. The two errors I encountered were:

RuntimeError: Expected one of cpu, cuda, xpu, mkldnn, opengl, opencl, ideep, hip, ve, ort, mlc, xla, lazy, vulkan, meta, hpu device type at start of device string: mps

and

AttributeError: module 'torch.backends' has no attribute 'mps'

This is because, even though I had installed the required PyTorch version, I was still running an x86 build of Python.
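
The architecture check above can also be made explicit in a script. This is a small sketch using only the standard library; is_native_arm64 is a hypothetical helper name, and the assumption is that a native Apple Silicon interpreter reports "arm64" from platform.machine() while an interpreter running under Rosetta reports "x86_64".

```python
import platform


def is_native_arm64(machine: str) -> bool:
    """True if the machine string names a 64-bit ARM architecture."""
    return machine.lower() in {"arm64", "aarch64"}


machine = platform.machine()
print(f"platform: {platform.platform()}")
print(f"machine:  {machine}")
if is_native_arm64(machine):
    print("Python is running as a native arm64 build.")
else:
    print("Python is not an arm64 build; the MPS backend will not be usable.")
```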

To fix these, do:

Note: this is extremely new and buggy. Hopefully it will get better soon.