About MobileNet's memory usage

Posted on 2025-01-12 14:16:50

I'm building MobileNetV1 with PyTorch, and I run out of memory every time I train the model. (The process prints "Killed!" and then crashes.)
This is my code:

Config file (YAML):

n_gpu: 0

arch: 
    type: MobileNet
    args: 
        in_channels: 3
        num_classes: 26
    
data_loader: 
    type: BallDataLoader
    args:
        data_dir: data/balls/
        batch_size: 64
        shuffle: true
        validation_split: 0.2
        num_workers: 0
        resize: 
        - 224
        - 224
    
optimizer:
    type: Adam
    args:
        lr: 1.0e-2
        weight_decay: 0
        amsgrad: true
    
loss: nll_loss
metrics: 
    - accuracy
    - top_k_acc

lr_scheduler: 
    type: StepLR
    args: 
        step_size: 50
        gamma: 0.1
    

trainer: 
    epochs: 50
    save_dir: saved/
    save_period: 2
    verbosity: 2
    monitor: min val_loss
    early_stop: 10
    tensorboard: true

modules.py:

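import torch.nn as nn
import torch.nn.functional as F


class DepthwiseSeparableConv(nn.Module):
    """Depthwise 3x3 conv followed by a pointwise 1x1 conv, each with BatchNorm and ReLU."""
    def __init__(self, in_channels, out_channels, kernel_size = 3, stride = 1, padding = None):
        super().__init__()
        if padding is None:
            # default to "same"-style padding for odd kernel sizes
            padding = kernel_size // 2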
        self.depth_wise_conv = nn.Conv2d(in_channels, in_channels, kernel_size, stride, padding, groups= in_channels)
        self.bn1 = nn.BatchNorm2d(in_channels)
        self.point_wise_conv = nn.Conv2d(in_channels, out_channels, (1,1), 1, 0)
        self.bn2 = nn.BatchNorm2d(out_channels)
        self.in_channels = in_channels
        self.out_channels = out_channels

    def forward(self, x):
        x = self.depth_wise_conv(x)
        x = self.bn1(x)
        x = F.relu(x)
        x = self.point_wise_conv(x)
        x = self.bn2(x)
        x = F.relu(x)
        return x

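model.py:

import torch.nn as nn
import torch.nn.functional as F

from modules import DepthwiseSeparableConv  # assumed import path, based on the file name above

# ImageNet is the asker's own base class; its definition is not shown in the post.
class MobileNet(ImageNet):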
    def __init__(self, in_channels = 3, num_classes = 1000):
        super().__init__()
        
        self.convs = nn.Sequential(
            nn.Conv2d(in_channels, 32, kernel_size= 3, padding= 1, stride = 1 ),
            nn.BatchNorm2d(32),
            nn.ReLU(inplace = True),
            DepthwiseSeparableConv(32, 64),
            DepthwiseSeparableConv(64, 128, stride = 2),
            DepthwiseSeparableConv(128, 128),
            DepthwiseSeparableConv(128, 256),
            DepthwiseSeparableConv(256, 256),
            DepthwiseSeparableConv(256, 512, stride = 2),
            DepthwiseSeparableConv(512, 512),
            DepthwiseSeparableConv(512, 512),
            DepthwiseSeparableConv(512, 512),
            DepthwiseSeparableConv(512, 512),
            DepthwiseSeparableConv(512, 512),
            DepthwiseSeparableConv(512, 1024, stride = 1),
            DepthwiseSeparableConv(1024, 1024, stride= 2),
            nn.AdaptiveAvgPool2d(1)
        )
        self.fc = nn.Linear(1024, num_classes)

    def forward(self, x):
        x = self.convs(x)
        x = x.view(-1, 1024)
        x = self.fc(x)
        x = F.log_softmax(x, dim = 1)
        return x

I then tried a model from https://github.com/jmjeon94/MobileNet-Pytorch, and it worked. After hours of debugging I still can't figure out why mine fails: the two models look nearly identical, and since the MobileNet architecture is fairly light, I assumed it shouldn't need much memory to run. Could this be caused by the Python interpreter, or is there actually something wrong with my code?
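
One way to test the "nearly identical" assumption is to trace the feature-map size through the posted stack and compare it with the stride layout in Table 1 of the MobileNetV1 paper. The sketch below is only an illustration: the stride lists are transcribed by hand (the posted one from model.py above, the other from the paper), the helper names are made up for this example, and it assumes 3x3 convolutions with padding 1 and a 224 x 224 input. It says nothing about what the linked repository does.

```python
# Strides copied from the posted model.py: the stem conv first, then each
# DepthwiseSeparableConv in order.
POSTED_STRIDES = [1, 1, 2, 1, 1, 1, 2, 1, 1, 1, 1, 1, 1, 2]
# Strides from Table 1 of the MobileNetV1 paper. The last block is listed as
# s2 there, but its input and output are both 7 x 7, so 1 is used here.
PAPER_STRIDES = [2, 1, 2, 1, 2, 1, 2, 1, 1, 1, 1, 1, 2, 1]


def conv_out(size, kernel=3, stride=1, padding=1):
    """Spatial output size of a convolution, using PyTorch's floor rule."""
    return (size + 2 * padding - kernel) // stride + 1


def trace_sizes(strides, size=224):
    """Feature-map side length after each 3x3 conv in the stack.

    The 1x1 pointwise convs are ignored because they keep the size unchanged.
    """
    sizes = []
    for stride in strides:
        size = conv_out(size, stride=stride)
        sizes.append(size)
    return sizes


posted = trace_sizes(POSTED_STRIDES)
paper = trace_sizes(PAPER_STRIDES)
for index, (a, b) in enumerate(zip(posted, paper)):
    print(f"layer {index:2d}: posted {a:3d} x {a:<3d}  paper {b:3d} x {b:<3d}")
```

Under these assumptions the posted stack keeps the full 224 x 224 resolution through its first block and ends at 28 x 28, while the paper's layout drops to 112 x 112 in the stem and ends at 7 x 7. Larger feature maps mean larger activations, so this may contribute to the memory use, though the thread does not establish it as the cause.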

Comments (2)

杯别 2025-01-19 14:16:50

I think it's because of your batch size. Try a smaller batch size, such as 32, 16, 8, 4, or 2.
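
The reasoning behind this suggestion is that activation memory grows linearly with the batch size. A rough sketch of that estimate for the posted layer stack follows. It rests on pessimistic assumptions that are not from the original post: float32 tensors, and every conv, batch-norm, and ReLU output kept alive for the backward pass. The function names and the `stem_stride` parameter are made up for this illustration.

```python
BYTES_PER_FLOAT32 = 4

# (out_channels, stride) of each DepthwiseSeparableConv, copied from model.py
BLOCKS = [(64, 1), (128, 2), (128, 1), (256, 1), (256, 1), (512, 2),
          (512, 1), (512, 1), (512, 1), (512, 1), (512, 1),
          (1024, 1), (1024, 2)]


def conv_out(size, kernel=3, stride=1, padding=1):
    """Spatial output size of a convolution, using PyTorch's floor rule."""
    return (size + 2 * padding - kernel) // stride + 1


def activation_bytes(batch, size, stem_stride=1, stem_channels=32):
    """Return (total bytes of intermediate outputs, final feature-map size)."""
    elements = 0
    size = conv_out(size, stride=stem_stride)
    channels = stem_channels
    # Stem: conv output and batch-norm output (the stem ReLU is in place).
    elements += 2 * batch * channels * size * size
    for out_channels, stride in BLOCKS:
        size = conv_out(size, stride=stride)
        # Depthwise conv, bn1, ReLU: three tensors with the input channel count.
        elements += 3 * batch * channels * size * size
        # Pointwise conv, bn2, ReLU: three tensors with the output channel count.
        elements += 3 * batch * out_channels * size * size
        channels = out_channels
    return elements * BYTES_PER_FLOAT32, size


for batch in (64, 32, 16, 8):
    total, _ = activation_bytes(batch, 224, stem_stride=1)
    print(f"batch {batch:2d}: about {total / 2**30:.1f} GiB of activations")
```

With these assumptions, batch size 64 at 224 x 224 lands in the tens of GiB, which would have to fit in system RAM since the config sets n_gpu: 0. Halving the batch halves the estimate, and calling `activation_bytes(64, 224, stem_stride=2)` gives a quarter of it. Real usage is likely lower, because autograd does not keep every intermediate tensor, so treat the numbers as an upper-bound style estimate only.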

若无相欠,怎会相见 2025-01-19 14:16:50

I deleted the line nn.Conv2d(in_channels, 32, kernel_size= 3, padding= 1, stride = 1 ), retyped the same line, and the code ran. I still don't know why, but it seems that the interpreter or the text editor caused the error. Thank you for your attention.
Special thanks to Mr. @Anmol Narang for your effort; I really appreciate it.
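
When a retyped line behaves differently from the one it replaced, a character-level diff of the old and new text can show whether the two are really the same. The sketch below uses Python's standard `difflib` module. The `retyped_line` value is hypothetical and invented purely for illustration; the thread does not say what, if anything, actually changed.

```python
import difflib


def line_differences(old, new):
    """Return (tag, old_fragment, new_fragment) for every non-matching span."""
    matcher = difflib.SequenceMatcher(a=old, b=new, autojunk=False)
    return [
        (tag, old[i1:i2], new[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]


# The line as posted in the question.
posted_line = "nn.Conv2d(in_channels, 32, kernel_size= 3, padding= 1, stride = 1 ),"
# Hypothetical retyped version, made up for this example only.
retyped_line = "nn.Conv2d(in_channels, 32, kernel_size= 3, padding= 1, stride = 2 ),"

for tag, old_part, new_part in line_differences(posted_line, retyped_line):
    # repr() makes invisible or look-alike characters visible as escapes.
    print(tag, repr(old_part), "->", repr(new_part))
```

An empty result means the two strings are identical character for character, in which case the difference has to come from somewhere else, such as another edit made at the same time or a stale cached file.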
