Can't retain_grad on a tensor that has requires_grad=False, despite specifically setting it to True
I am attempting to create an nn.Module in PyTorch. The set_model_params method is currently very messy, but I am trying to set requires_grad to True so that I can use retain_grad(). No matter where I put requires_grad=True, though, it tells me it is False:
import torch
import torch.nn as nn

class MyModule(nn.Module):
    def __init__(self, depth):
        super(MyModule, self).__init__()
        self.linear1 = nn.Linear(28 * 28, 50)
        self.linear3 = nn.Linear(50, 10)

    def forward(self, x):
        x = x.view(-1, 28 * 28)
        x = torch.tanh(self.linear1(x))
        return self.linear3(x).view(-1)

    def set_model_params(self, params_dict):
        self.linear1.weight.data = params_dict["W1"]
        self.linear1.weight.data.requires_grad = True
        self.linear1.weight.data.retain_grad()

        self.linear1.bias.data = params_dict["b1"]
        self.linear1.bias.data.requires_grad = True
        self.linear1.bias.data.retain_grad()

        self.linear3.weight.data = params_dict["W3"]
        self.linear3.weight.data.requires_grad = True
        self.linear3.weight.data.retain_grad()

        self.linear3.bias.data = params_dict["b3"]
        self.linear3.bias.data.requires_grad = True
        self.linear3.bias.data.retain_grad()
The parameters are set this way because the maximum and minimum values of the network parameters will change later on. I also have requires_grad=True set here:
def init_params_dict(d):
    params_dict = {
        "W1": torch.rand(50, 28 * 28, device="cuda", requires_grad=True) * (d * 2) - 1,
        "b1": torch.zeros(50, device="cuda", requires_grad=True),
        "W3": torch.rand(10, 50, device="cuda", requires_grad=True) * (d * 2) - 1,
        "b3": torch.zeros(10, device="cuda", requires_grad=True),
    }
    for i in range(8):
        params_dict['Wd' + str(i)] = torch.rand(50, 50, device="cuda", requires_grad=True) * (d * 2) - 1
        params_dict['bd' + str(i)] = torch.zeros(50, device="cuda", requires_grad=True)
    return params_dict

mymod = MyModule(8)
public_params = init_params_dict(0.01)
mymod.set_model_params(public_params)
Nonetheless, I still get this error:
<ipython-input-88-52eb353b30c6> in set_model_params(self, params_dict)
     27         self.linear1.weight.data = params_dict["W1"]
     28         self.linear1.weight.data.requires_grad=True
---> 29         self.linear1.weight.data.retain_grad()
     30
     31         self.linear1.bias.data = params_dict["b1"]

RuntimeError: can't retain_grad on Tensor that has requires_grad=False
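A minimal repro outside the module shows the same behavior, which makes me suspect that every .data access returns a fresh detached view, so the flag I set never survives to the next access (the names w and view below are just for illustration):

import torch
import torch.nn as nn

w = nn.Linear(2, 2).weight   # an ordinary leaf parameter, requires_grad=True

view = w.data                # .data hands back a *detached* view of the storage
view.requires_grad = True    # this only flags that temporary view...
print(w.data.requires_grad)  # ...so a fresh .data access still prints False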
How can I set requires_grad=True on one of these leaf variables and have it stick?
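For reference, one fix I am considering (just a sketch, not yet verified in my full setup) is to avoid .data entirely and rebuild the parameters, for example by wrapping the incoming tensors in nn.Parameter. The parameters then stay leaf tensors with requires_grad=True, and .grad should be populated by backward() without needing retain_grad() at all:

def set_model_params(self, params_dict):
    # Sketch: detach().clone() makes each incoming tensor a leaf, and
    # assigning an nn.Parameter re-registers it on the module with
    # requires_grad=True by default.
    self.linear1.weight = nn.Parameter(params_dict["W1"].detach().clone())
    self.linear1.bias = nn.Parameter(params_dict["b1"].detach().clone())
    self.linear3.weight = nn.Parameter(params_dict["W3"].detach().clone())
    self.linear3.bias = nn.Parameter(params_dict["b3"].detach().clone())

An in-place alternative would be `with torch.no_grad(): self.linear1.weight.copy_(params_dict["W1"])`, which keeps the existing parameter objects (and any optimizer references to them) intact. But I would still like to understand why the requires_grad=True I set in the original version does not stick.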