A question about using .grad inside torch.no_grad()
I wrote some toy examples to understand how torch.no_grad() works:
# example 1
import torch
a = torch.randn(10, 5, requires_grad = True)
z = a * 3
l = z - 0.5
l.sum().backward()
with torch.no_grad():
    a -= a.grad
print(a.requires_grad)
# True
So, a -= a.grad inside with torch.no_grad() will keep a.requires_grad = True.
# example 2
import torch
a = torch.randn(10, 5, requires_grad = True)
z = a * 3
l = z - 0.5
l.sum().backward()
with torch.no_grad():
    a = a - a.grad
print(a.requires_grad)
# False
But, a = a - a.grad inside with torch.no_grad() will set a.requires_grad = False.
# example 3
import torch
a = torch.randn(10, 5, requires_grad = True)
z = a * 3
l = z - 0.5
l.sum().backward()
a -= a.grad
print(a.requires_grad)
# RuntimeError: a leaf Variable that requires grad is being used in an in-place operation.
a -= a.grad without with torch.no_grad() will throw a RuntimeError (but a = a - a.grad does not).
I could not find explanations for the above results. Could somebody point me in the right direction? Many thanks!
Comments (2)
I think the answer above did not explain the 3 cases explicitly.
case 1
a -= a.grad is an in-place operation, so it does not change the requires_grad attribute of a. Thus a.requires_grad stays True.
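A minimal sketch of this (my own illustration, not part of the original answer): the in-place update writes into the tensor's existing storage, so the object and its requires_grad flag are left alone.
import torch
a = torch.randn(10, 5, requires_grad=True)
(a * 3 - 0.5).sum().backward()
ptr = a.data_ptr()               # address of the underlying storage
with torch.no_grad():
    a -= a.grad                  # in-place: writes into the same storage
print(a.data_ptr() == ptr)       # True, still the same tensor object
print(a.requires_grad)           # True, the flag is untouched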
case 2
a = a - a.grad is not an in-place operation, so it creates a new tensor object in memory and rebinds the name a to it. Since the new tensor is created in no-grad mode, a.requires_grad = False.
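As a small sketch (again my own illustration), you can check that the name a now points at a brand-new tensor:
import torch
a = torch.randn(10, 5, requires_grad=True)
(a * 3 - 0.5).sum().backward()
old_id = id(a)
with torch.no_grad():
    a = a - a.grad               # out-of-place: builds a new tensor, rebinds a
print(id(a) == old_id)           # False, a is a different object now
print(a.requires_grad)           # False, it was created while autograd was off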
case 3
You cannot change a leaf tensor with requires_grad = True in place outside no-grad mode. See the discussion.
I also tried the code below, and it does throw the same runtime error!
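As a minimal sketch of the kind of code meant here (not the original snippet): any other in-place update on a leaf tensor that requires grad, done outside no-grad mode, raises the same error, while wrapping it in torch.no_grad() makes it legal.
import torch
a = torch.randn(10, 5, requires_grad=True)
try:
    a -= 1                       # in-place update on a leaf that requires grad
except RuntimeError as e:
    print(e)                     # a leaf Variable that requires grad is being used in an in-place operation.
with torch.no_grad():
    a -= 1                       # the same update is fine with autograd disabled
print(a.requires_grad)           # True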
I hope my answer helps you.
a.grad is None at first and becomes a Tensor the first time backward() is called. The grad attribute will then contain the computed gradients, and future calls to backward() will accumulate (add) gradients into it.
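A minimal sketch of that accumulation behaviour (my own example, reusing the question's toy setup):
import torch
a = torch.randn(10, 5, requires_grad=True)
print(a.grad)                    # None, backward() has not been called yet
(a * 3).sum().backward()
print(a.grad[0, 0].item())       # 3.0, since d/da of sum(3 * a) is 3 everywhere
(a * 3).sum().backward()
print(a.grad[0, 0].item())       # 6.0, the second backward() added another 3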
If you use with torch.no_grad():, the PyTorch autograd engine is disabled, so you don't get the error. requires_grad = True is an indicator to record gradients on a tensor. The error you are getting in the last example is telling you that you cannot use a.grad in an in-place operation there, but you may use a = a - a.grad. Just for the example:
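Here is a minimal sketch of such an example (my own illustration, with the expected values shown as comments):
import torch
a = torch.randn(10, 5, requires_grad=True)
z = a * 3
l = z - 0.5
l.sum().backward()
a = a - a.grad                   # out-of-place update, no no_grad needed, no error
print(a.requires_grad)           # True, it was created while autograd was on
print(a.is_leaf)                 # False, it is now the result of a subtraction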