How to update a part of a torch.nn.Parameter
For updating a part of the parameters defined by torch.nn.Parameter, I have tested the following three ways, but only one works.
#(1)
import torch
class NET(torch.nn.Module):
    def __init__(self):
        super(NET, self).__init__()
        self.params = torch.ones(4)
        self.P = torch.nn.Parameter(torch.ones(1))
        self.params[1] = self.P
    def forward(self, x):
        y = x * self.params
        return y.sum()

net = NET()
x = torch.rand(4)
optim = torch.optim.Adam(net.parameters(), lr=0.001)
for _ in range(10):
    optim.zero_grad()
    loss = net(x)
    loss.backward()
    optim.step()
# RuntimeError: Trying to backward through the graph a second time
#(2)
import torch
class NET(torch.nn.Module):
    def __init__(self):
        super(NET, self).__init__()
        self.P = torch.nn.Parameter(torch.ones(1))
    def forward(self, x):
        params = torch.ones(4)
        params[1] = self.P
        y = x * params
        return y.sum()

net = NET()
x = torch.rand(4)
optim = torch.optim.Adam(net.parameters(), lr=0.001)
for _ in range(10):
    optim.zero_grad()
    loss = net(x)
    loss.backward()
    optim.step()
# It works, but the tensor has to be created and the assignment repeated in every forward pass.
#(3)
import torch
class NET(torch.nn.Module):
    def __init__(self):
        super(NET, self).__init__()
        self.params = torch.nn.Parameter(torch.ones(4))
    def forward(self, x):
        y = x * self.params
        return y.sum()

net = NET()
net.params[1].requires_grad = False
x = torch.rand(4)
optim = torch.optim.Adam(net.parameters(), lr=0.001)
for _ in range(10):
    optim.zero_grad()
    loss = net(x)
    loss.backward()
    optim.step()
# RuntimeError: you can only change requires_grad flags of leaf variables.
I would like to know how to update part of a parameter in ways like (1) and (3).
1 Answer
A small note on the use of requires_grad and nn.Parameter: if you had to freeze a sub-module of your nn.Module, you would require the use of requires_grad_. However, you cannot partially require gradients on a tensor.
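For instance, a minimal sketch of freezing a whole sub-module this way (the two-layer Sequential model here is made up for illustration, not taken from the question):

import torch

model = torch.nn.Sequential(
    torch.nn.Linear(4, 4),
    torch.nn.Linear(4, 1),
)
# Freeze every parameter of the first layer in one call.
model[0].requires_grad_(False)
print([p.requires_grad for p in model.parameters()])  # [False, False, True, True]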
A nn.Parameter is a wrapper which allows a given torch.Tensor to be registered inside a nn.Module. By default, the wrapped tensor will require gradient computation. You must therefore absolutely have your parameter tensor defined as:
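Presumably something along the lines of snippet (3) from the question, i.e. inside __init__, wrapping the tensor in nn.Parameter so it is registered and requires gradients by default:

self.params = torch.nn.Parameter(torch.ones(4))  # registered in the module; requires_grad=True by default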
And not as:
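The contrast is presumably with the pattern from snippet (1), a plain tensor attribute into which a parameter is merely copied:

self.params = torch.ones(4)   # a plain tensor attribute: never returned by net.parameters()
self.params[1] = self.P       # copying P into it builds a graph, but params itself is not trainable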
Ultimately, you should check the content of your registered parameters with nn.Module#parameters before loading them into an optimizer.
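Such a check might look as follows (a sketch using the module from snippet (3); with the module from snippet (1), only P would be listed, because the plain tensor self.params is not registered):

import torch

class NET(torch.nn.Module):      # the module from snippet (3)
    def __init__(self):
        super(NET, self).__init__()
        self.params = torch.nn.Parameter(torch.ones(4))
    def forward(self, x):
        return (x * self.params).sum()

net = NET()
for name, p in net.named_parameters():
    print(name, p.shape, p.requires_grad)  # -> params torch.Size([4]) True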
Your first code (#1) crashes because you are performing multiple backpropagations on the same graph without explicitly setting retain_graph to True. The following process works fine:
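A sketch of what that presumably amounts to, reusing the NET class from snippet (1) and changing only the backward call:

import torch

class NET(torch.nn.Module):      # same module as in snippet (1)
    def __init__(self):
        super(NET, self).__init__()
        self.params = torch.ones(4)
        self.P = torch.nn.Parameter(torch.ones(1))
        self.params[1] = self.P
    def forward(self, x):
        y = x * self.params
        return y.sum()

net = NET()
x = torch.rand(4)
optim = torch.optim.Adam(net.parameters(), lr=0.001)
for _ in range(10):
    optim.zero_grad()
    loss = net(x)
    # Keep the graph built by the assignment in __init__ alive across iterations,
    # so the "backward through the graph a second time" error no longer occurs.
    loss.backward(retain_graph=True)
    optim.step()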
Your second code (#2) is correct because you are assigning the tensor which requires gradient into a different tensor. A minimal implementation to check that the gradient is indeed computed on P is as follows:
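A sketch of such a check, reusing the NET class from snippet (2):

import torch

class NET(torch.nn.Module):      # same module as in snippet (2)
    def __init__(self):
        super(NET, self).__init__()
        self.P = torch.nn.Parameter(torch.ones(1))
    def forward(self, x):
        params = torch.ones(4)
        params[1] = self.P
        return (x * params).sum()

net = NET()
x = torch.rand(4)
loss = net(x)
loss.backward()
print(net.P.grad)  # equals x[1], so the gradient does flow back to P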
Your third code (#3) is invalid because you are requiring gradient computation on only part of a tensor, which is not possible: requires_grad can only be set on a whole leaf tensor, not on individual elements of it.

An alternative way to do it instead is by masking the gradient after the backpropagation has been done on the parameters:
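A sketch of that masking approach, reusing the NET class from snippet (3): the gradient of the element that should stay fixed (index 1, as in the question) is zeroed between backward() and the optimizer step, so that element is never updated.

import torch

class NET(torch.nn.Module):      # same module as in snippet (3)
    def __init__(self):
        super(NET, self).__init__()
        self.params = torch.nn.Parameter(torch.ones(4))
    def forward(self, x):
        y = x * self.params
        return y.sum()

net = NET()
x = torch.rand(4)
optim = torch.optim.Adam(net.parameters(), lr=0.001)
mask = torch.tensor([1., 0., 1., 1.])  # 0 marks the element to freeze (index 1)
for _ in range(10):
    optim.zero_grad()
    loss = net(x)
    loss.backward()
    net.params.grad *= mask            # zero the gradient of the frozen element
    optim.step()
print(net.params)  # element 1 stays at 1.0; the other elements have been updated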