When does PyTorch initialize parameters?
I'm now writing my own network with PyTorch, and I want to use a pretrained model in my net. Here is my overridden __init__() code:

    class Generator(nn.Module):
        def __init__(self) -> None:
            super(Generator, self).__init__()
            model_path = "somedir"
            checkpoint = torch.load(model_path)
            h_model = H_model()
            h_model.load_state_dict(checkpoint['model'])
            # set to evaluation (test) mode
            h_model.eval()
            self.H_model = h_model
            self.unet = UNet(enc_chs=(9, 64, 128, 256, 512), dec_chs=(512, 256, 128, 64), num_class=3, retain_dim=False, out_sz=(304, 304))
Here, h_model is loaded from a checkpoint that I have already trained well.
My question is: after the initialization, will the parameters in h_model have changed (are the pretrained parameter values modified by some function)? And why? (I mean, how does PyTorch treat a self-defined layer when it initializes parameters, and when does PyTorch initialize parameters?)
1 Answer
For the basic layers (e.g., nn.Conv, nn.Linear, etc.) the parameters are initialized by the __init__ method of the layer itself.

For example, look at the source code of class _ConvNd(Module) (the class from which all other convolution layers are derived). At the bottom of its __init__ it calls self.reset_parameters(), which initializes the weights.

Therefore, if your nn.Module does not have any "independent" nn.Parameters, only trainable parameters inside sub-nn.Modules, then when you construct your network all weights of the submodules are initialized as the submodules themselves are constructed.

That is, once you call h_model = H_model(), the weights of h_model are already initialized to their default values. Calling h_model.load_state_dict(...) then overrides these values with the desired pretrained weights. Assigning the loaded h_model to self.H_model inside Generator.__init__ merely registers it as a submodule and does not re-initialize it.
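A minimal sketch of the points above, using hypothetical stand-ins (TinyModel and Wrapper are illustrative names, not the asker's H_model/Generator): weights are initialized while __init__ runs, load_state_dict overrides those defaults, and registering a pretrained module inside a parent module leaves its weights untouched.

```python
import torch
import torch.nn as nn

# Hypothetical stand-in for a pretrained model such as H_model.
class TinyModel(nn.Module):
    def __init__(self):
        super().__init__()
        # The weights are initialized right here, while __init__ runs
        # (nn.Linear calls its own reset_parameters internally).
        self.fc = nn.Linear(4, 2)

# Hypothetical stand-in for a parent module such as Generator.
class Wrapper(nn.Module):
    def __init__(self, pretrained):
        super().__init__()
        # Assigning an existing module only registers it as a submodule;
        # its weights are NOT re-initialized.
        self.pretrained = pretrained

model_a = TinyModel()
model_b = TinyModel()
# Two fresh models get different random default weights:
assert not torch.equal(model_a.fc.weight, model_b.fc.weight)

# load_state_dict overrides the default values with the saved ones:
model_b.load_state_dict(model_a.state_dict())
assert torch.equal(model_a.fc.weight, model_b.fc.weight)

# Wrapping a pretrained module leaves its weights untouched:
saved = model_a.fc.weight.detach().clone()
wrapper = Wrapper(model_a)
assert torch.equal(wrapper.pretrained.fc.weight, saved)
```

Each assertion passing confirms the corresponding claim in the answer.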