How do I train the last few layers of a RegNet-800MF backbone using PyTorch Lightning?

Posted 2025-02-10 03:46:00

I am trying to get better results by allowing a few final layers of a previously frozen backbone (RegNet-800MF) to be trained. How can I implement this in PyTorch Lightning? I am very new to ML so please excuse me if I have left any important information out.

My model (MechClassifier) calls another class (ParametersClassifier) which includes the pre-trained RegNet as its frozen backbone. During training, the forward function passes inputs only through the backbone of ParametersClassifier and not its classifying layers. I will include the relevant parts of both classes below.

My MechClassifier model:

import torch
import torch.nn as nn
import pytorch_lightning as pl


class MechClassifier(pl.LightningModule):
    def __init__(
        self,
        num_classes,
        lr=4e-3,
        weight_decay=1e-8,
        gpus=1,
        max_epochs=30,
    ):
        super().__init__()
        self.lr = lr
        self.weight_decay = weight_decay
        self.__dict__.update(locals())
        
        self.backbone = ParametersClassifier.load_from_checkpoint(
            checkpoint_path="checkpoints/param_classifier/last.ckpt",
            num_classes=3,
            gpus=1,
        )
        
        self.backbone.freeze()
        self.backbone.eval()


        self.mf_classifier = nn.Sequential(
            nn.Linear(self.backbone.num_ftrs, 8),
            nn.ReLU(),
            nn.Linear(8, num_classes),
        )
        
        self.wd_classifier = nn.Sequential(
            nn.Linear(self.backbone.num_ftrs, 8),
            nn.ReLU(),
            nn.Linear(8, num_classes),
        )

    def forward(self, x):
        self.backbone.eval()
        with torch.no_grad():
            x = self.backbone.model(x)

        # x = self.model(x)

        out1 = self.mf_classifier(x)
        out2 = self.wd_classifier(x)

        # print(out1.size())
        return (out1, out2)

ParametersClassifier (loaded from checkpoint):

import torch.nn as nn
import pytorch_lightning as pl
from torchvision import models


class ParametersClassifier(pl.LightningModule):
    def __init__(
        self,
        num_classes,
        lr=4e-3,
        weight_decay=0.05,
        gpus=1,
        max_epochs=30,
    ):
        super().__init__()
        self.lr = lr
        self.weight_decay = weight_decay
        self.__dict__.update(locals())

        self.model = models.regnet_y_800mf(pretrained=True)
        self.num_ftrs = self.model.fc.in_features
        self.model.fc = nn.Identity()
        self.fc1 = nn.Linear(self.num_ftrs, num_classes)
        self.fc2 = nn.Linear(self.num_ftrs, num_classes)
        self.fc3 = nn.Linear(self.num_ftrs, num_classes)
        self.fc4 = nn.Linear(self.num_ftrs, num_classes)

    def forward(self, x):
        x = self.model(x)
        out1 = self.fc1(x)
        out2 = self.fc2(x)
        out3 = self.fc3(x)
        out4 = self.fc4(x)
        return (out1, out2, out3, out4)

Comments (1)

岛徒 2025-02-17 03:46:00

You can look at the torchvision implementation of the RegNet model you are using. Its forward function:

def forward(self, x: Tensor) -> Tensor:
    x = self.stem(x)
    x = self.trunk_output(x)

    x = self.avgpool(x)
    x = x.flatten(start_dim=1)
    x = self.fc(x)

    return x
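
If you want to see the exact stage and section names (used below to choose what to unfreeze), one way is to print the named children of trunk_output. This is a small sketch based on the torchvision implementation, where stages are named block1 to block4 and the sections inside them are named like block4-1; the exact names can differ between torchvision versions:

from torchvision import models

# weights are not needed just to inspect the module structure
model = models.regnet_y_800mf()

# list every stage of the trunk and the sections it contains
for stage_name, stage in model.trunk_output.named_children():
    for section_name, _ in stage.named_children():
        print(stage_name, "->", section_name)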

Instead of using a torch.no_grad context manager as you did, you should toggle requires_grad on and off as necessary. By default, module parameters have their requires_grad flag set to True, which means gradients will be computed for them. If this flag is set to False, you can consider those components frozen.
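
As a small generic illustration (two plain nn.Linear layers standing in for a frozen backbone section and a trainable head, not your actual model) of the difference between the two approaches:

import torch
import torch.nn as nn

frozen = nn.Linear(3, 3).requires_grad_(False)   # treated as frozen
trainable = nn.Linear(3, 3)                      # requires_grad=True by default

x = torch.randn(1, 3)

# wrapping the call in no_grad disables gradient tracking entirely,
# so even the trainable layer could not be updated this way
with torch.no_grad():
    out = trainable(frozen(x))
print(out.requires_grad)                  # False

# without no_grad, gradients flow through the trainable layer while
# the frozen layer's own weights still receive no gradient
out = trainable(frozen(x))
out.sum().backward()
print(frozen.weight.grad)                 # None
print(trainable.weight.grad is not None)  # True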

Depending on which layers you want to freeze and which you want to fine-tune, you can do that manually. For example, to freeze the backbone and fine-tune only the last section of its final block, replace the following in MechClassifier's __init__:

self.backbone.freeze()
self.backbone.eval()

With the following lines:

## freeze all
self.backbone.model.requires_grad_(False)

## unfreeze last section of 4th block of backbone 
block4_section1 = getattr(self.backbone.model.trunk_output.block4, 'block4-1')
block4_section1.requires_grad_(True)
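
As a quick sanity check (for example right after the lines above in MechClassifier's __init__, assuming the section names match your torchvision version), you can list which backbone parameters are actually left trainable:

# only parameters under trunk_output.block4.block4-1 should be printed
for name, param in self.backbone.model.named_parameters():
    if param.requires_grad:
        print(name)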

And perform the forward pass in MechClassifier like so, without the torch.no_grad block:

def forward(self, x):
    self.backbone.eval()
    x = self.backbone.model(x)
    out1 = self.mf_classifier(x)
    out2 = self.wd_classifier(x)
    return (out1, out2)
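
One more Lightning-specific detail not shown above: the optimizer will simply skip parameters that never receive a gradient, but it is slightly cleaner to hand it only the trainable ones in configure_optimizers. A minimal sketch, assuming a plain Adam optimizer rather than whatever your existing configure_optimizers contains:

def configure_optimizers(self):
    # only the two classifier heads plus the unfrozen block4-1 section
    # of the backbone have requires_grad=True at this point
    trainable_params = [p for p in self.parameters() if p.requires_grad]
    return torch.optim.Adam(
        trainable_params, lr=self.lr, weight_decay=self.weight_decay
    )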