PyTorch Lightning (trainable params - wrong)

Posted on 2025-01-22 05:52:20

I am using multi-GPU training with PyTorch Lightning. The output below shows the model summary:

GPU available: True, used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0,1,2,3]
┏━━━━┳━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━┳━━━━━━━━┓
┃    ┃ Name       ┃ Type              ┃ Params ┃
┡━━━━╇━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━╇━━━━━━━━┩
│ 0  │ encoder    │ Encoder           │  2.0 M │
│ 1  │ classifier │ Sequential        │  8.8 K │
│ 2  │ criterion  │ BCEWithLogitsLoss │      0 │
│ 3  │ train_acc  │ Accuracy          │      0 │
│ 4  │ val_acc    │ Accuracy          │      0 │
│ 5  │ train_auc  │ AUROC             │      0 │
│ 6  │ val_auc    │ AUROC             │      0 │
│ 7  │ train_f1   │ F1Score           │      0 │
│ 8  │ val_f1     │ F1Score           │      0 │
│ 9  │ train_mcc  │ MatthewsCorrCoef  │      0 │
│ 10 │ val_mcc    │ MatthewsCorrCoef  │      0 │
│ 11 │ train_sens │ Recall            │      0 │
│ 12 │ val_sens   │ Recall            │      0 │
│ 13 │ train_spec │ Specificity       │      0 │
│ 14 │ val_spec   │ Specificity       │      0 │
└────┴────────────┴───────────────────┴────────┘
Trainable params: 2.0 M
Non-trainable params: 0

I have set the encoder to be untrainable using the code below:

ckpt = torch.load(chk_path)
self.encoder.load_state_dict(ckpt['state_dict'])
self.encoder.requires_grad = False

Shouldn't the trainable params be 8.8 K rather than 2.0 M?

My optimizer is the following:

optimizer = torch.optim.RMSprop(filter(lambda p: p.requires_grad, self.parameters()), lr=self.lr, weight_decay=self.weight_decay)
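
For reference, a quick way to cross-check what the summary reports is to count parameters directly by their requires_grad flag; a minimal sketch (hypothetical, not part of the original code, assuming it runs inside the LightningModule above):

# hypothetical check: count parameters by their requires_grad flag
trainable = sum(p.numel() for p in self.parameters() if p.requires_grad)
frozen = sum(p.numel() for p in self.parameters() if not p.requires_grad)
print(f"trainable: {trainable:,}  frozen: {frozen:,}")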

Comments (3)

抹茶夏天i‖ 2025-01-29 05:52:20

self.encoder.requires_grad = False doesn't do anything; in fact, torch Modules don't have a requires_grad flag.

What you should do instead is use the requires_grad_ method (note the trailing underscore), which will set requires_grad for all the parameters of this module to the desired value:

self.encoder.requires_grad_(False)

as described here: https://pytorch.org/docs/stable/generated/torch.nn.Module.html#torch.nn.Module.requires_grad_
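
A minimal sketch (using a hypothetical stand-in module, not the encoder from the question) showing the difference between the attribute assignment and the method call:

import torch

enc = torch.nn.Linear(4, 4)        # hypothetical stand-in for the encoder

enc.requires_grad = False          # plain attribute assignment on the Module
print(enc.weight.requires_grad)    # True -- the parameters are unaffected

enc.requires_grad_(False)          # Module.requires_grad_ propagates to every parameter
print(enc.weight.requires_grad)    # False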

岁月流歌 2025-01-29 05:52:20

You need to set requires_grad=False for all encoder parameters one by one:

for param in self.encoder.parameters():
    param.requires_grad = False
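
Applied to the loading code from the question, this would look roughly like the sketch below (chk_path and self.encoder are the names used in the original post):

ckpt = torch.load(chk_path)
self.encoder.load_state_dict(ckpt['state_dict'])

# freeze every encoder parameter individually so the optimizer filter
# and the Lightning model summary both see them as non-trainable
for param in self.encoder.parameters():
    param.requires_grad = False
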
贵在坚持 2025-01-29 05:52:20

Notice that if you execute the following piece of code:

import torch
import torch.nn.functional as F
from pytorch_lightning import LightningModule
from pytorch_lightning.utilities.model_summary import ModelSummary

class MNISTModel(LightningModule):
    def __init__(self):
        super().__init__()
        # layer sizes chosen so the summary matches the parameter counts shown below
        self.l1 = torch.nn.Linear(28 * 28, 1568)
        self.l2 = torch.nn.Linear(1568, 1568)
        self.l3 = torch.nn.Linear(1568, 10)

    def forward(self, x):
        x = torch.relu(self.l1(x.view(x.size(0), -1)))
        x = torch.relu(self.l2(x))
        return self.l3(x)

    def training_step(self, batch, batch_nb):
        x, y = batch
        loss = F.cross_entropy(self(x), y)
        return loss

    def configure_optimizers(self):
        return torch.optim.Adam(self.parameters(), lr=0.02)

mnist_model = MNISTModel()
mnist_model.l2.requires_grad = False
print(mnist_model.l2.weight.requires_grad)
print(mnist_model.l2.bias.requires_grad)
ModelSummary(mnist_model)

You will get:

True
True

  | Name | Type   | Params
--------------------------------
0 | l1   | Linear | 1.2 M 
1 | l2   | Linear | 2.5 M 
2 | l3   | Linear | 15.7 K
--------------------------------
3.7 M     Trainable params
0         Non-trainable params
3.7 M     Total params
14.827    Total estimated model params size (MB)

which means that this is actually not deactivating requires_grad for the parameters in that layer. So, you have two options, according to https://pytorch.org/docs/stable/notes/autograd.html#setting-requires-grad:

  1. Applying .requires_grad_() to a module as suggested by @burzam (the more correct one)
mnist_model = MNISTModel()
mnist_model.l2.requires_grad_(False)
ModelSummary(mnist_model)
  | Name | Type   | Params
--------------------------------
0 | l1   | Linear | 1.2 M 
1 | l2   | Linear | 2.5 M 
2 | l3   | Linear | 15.7 K
--------------------------------
1.2 M     Trainable params
2.5 M     Non-trainable params
3.7 M     Total params
14.827    Total estimated model params size (MB)
  2. Loop through the parameters in the module
mnist_model = MNISTModel()
for param in mnist_model.l2.parameters():
    param.requires_grad = False

ModelSummary(mnist_model)

you will see:

  | Name | Type   | Params
--------------------------------
0 | l1   | Linear | 1.2 M 
1 | l2   | Linear | 2.5 M 
2 | l3   | Linear | 15.7 K
--------------------------------
1.2 M     Trainable params
2.5 M     Non-trainable params
3.7 M     Total params
14.827    Total estimated model params size (MB)

You need to set requires_grad to False for all the parameters in the specific layers you want to deactivate.
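
Translated back to the model in the question, option 1 would look roughly like this (a sketch using the names from the original post):

ckpt = torch.load(chk_path)
self.encoder.load_state_dict(ckpt['state_dict'])
self.encoder.requires_grad_(False)   # note the trailing underscore

Depending on the encoder, you may also want to call self.encoder.eval() so that dropout and batch-norm statistics stay frozen as well.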
