Question about using .grad inside torch.no_grad()

Posted 2025-01-11 06:39:39


I wrote some toy examples to understand how torch.no_grad() works:

# example 1
import torch
a = torch.randn(10, 5, requires_grad = True)
z = a * 3
l = z - 0.5
l.sum().backward()
with torch.no_grad():
  a -= a.grad
  print(a.requires_grad)
# True

So, a -= a.grad inside with torch.no_grad() keeps a.requires_grad = True.

# example2
import torch
a = torch.randn(10, 5, requires_grad = True)
z = a * 3
l = z - 0.5
l.sum().backward()
with torch.no_grad():
  a = a - a.grad
  print(a.requires_grad)
# False

But, a = a - a.grad inside with torch.no_grad() will set a.requires_grad = False.

# example3
import torch
a = torch.randn(10, 5, requires_grad = True)
z = a * 3
l = z - 0.5
l.sum().backward()
a -= a.grad
print(a.requires_grad)
# RuntimeError: a leaf Variable that requires grad is being used in an in-place operation.

a -= a.grad without with torch.no_grad() will throw a RuntimeError (but a -= 1 does not).

I could not find an explanation for the above results. Could somebody point me in the right direction?
Many thanks!


少女净妖师 2025-01-18 06:39:39


I think the answer above did not explain the 3 cases explicitly.

Case 1

a -= a.grad is an in-place operation, so it does not change the requires_grad attribute of a. Thus a.requires_grad stays True.
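A minimal sketch of case 1 (reusing the question's setup) that also checks the tensor's identity, to make the "same object, same flags" point concrete:

# Case 1 sketch: the in-place update modifies the existing leaf tensor,
# so its identity and its requires_grad flag are unchanged.
import torch

a = torch.randn(10, 5, requires_grad=True)
(a * 3 - 0.5).sum().backward()    # populate a.grad, as in the question

before = id(a)
with torch.no_grad():
    a -= a.grad                   # in-place: the same tensor object is modified
print(id(a) == before)            # True: still the same object
print(a.requires_grad)            # True: the flag was never touched
print(a.is_leaf)                  # True: a is still a leaf tensor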

Case 2

a = a - a.grad is not an in-place operation, so it creates a new tensor object in memory. Since that tensor is created in no-grad mode, a.requires_grad = False.
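A matching sketch of case 2 (same assumed setup): the out-of-place subtraction builds a brand-new tensor while grad tracking is off, and the name a is simply rebound to it; the original leaf is untouched:

# Case 2 sketch: `a = a - a.grad` rebinds `a` to a new tensor that was
# created inside no-grad mode, so that new tensor has requires_grad=False.
import torch

a = torch.randn(10, 5, requires_grad=True)
(a * 3 - 0.5).sum().backward()

original = a                      # keep a handle on the original leaf
with torch.no_grad():
    a = a - a.grad                # out-of-place: allocates a new tensor
print(a is original)              # False: `a` now names a different tensor
print(a.requires_grad)            # False: built while tracking was disabled
print(original.requires_grad)     # True: the original leaf is unchanged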

Case 3

You cannot modify a leaf tensor that has requires_grad = True in place outside no-grad mode.

See the discussion.
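The usual way around case 3 is exactly what example 1 already does: wrap the in-place parameter update in torch.no_grad(), the standard pattern for a manual SGD step. A minimal sketch (the learning rate lr is just an illustrative value):

# Case 3 sketch: an in-place update on a leaf that requires grad must run
# inside torch.no_grad(); outside it, autograd raises a RuntimeError.
import torch

lr = 0.1
a = torch.randn(10, 5, requires_grad=True)
(a * 3 - 0.5).sum().backward()

try:
    a -= lr * a.grad              # outside no_grad: forbidden in-place op
except RuntimeError as err:
    print(err)                    # "a leaf Variable that requires grad ..."

with torch.no_grad():
    a -= lr * a.grad              # the standard manual update
    a.grad.zero_()                # optionally clear the accumulated gradient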

I also tried the code below:

import torch
a = torch.tensor([1.1], requires_grad=True)
b = a ** 2
b.backward()
a -= 1
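# RuntimeError: a leaf Variable that requires grad is being used in an in-place operation.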

and it does throw the same RuntimeError!

I hope my answer helps you.

柏拉图鍀咏恒 2025-01-18 06:39:39


The a.grad attribute is None at first and becomes a Tensor the first time backward() is called.

The grad attribute will then contain the computed gradients, and future calls to backward() will accumulate (add) gradients into it.
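A quick sketch with a fresh toy tensor (just for illustration) showing both points: .grad starts as None, and repeated backward() calls add into it until you reset it:

# Sketch: .grad starts as None and accumulates across backward() calls.
import torch

a = torch.tensor([2.0], requires_grad=True)
print(a.grad)                     # None: nothing computed yet

(a * 3).sum().backward()
print(a.grad)                     # tensor([3.])

(a * 3).sum().backward()
print(a.grad)                     # tensor([6.]): the second call added to it

a.grad.zero_()                    # clear the accumulated gradient in place
print(a.grad)                     # tensor([0.])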

If you use with torch.no_grad(): then the PyTorch autograd engine is disabled, so you don't get the error.

requires_grad = True is an indicator that gradients should be recorded for a tensor.
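Both points can be seen on a small toy example: operations on a tracked tensor record a grad_fn, while the same operation inside no_grad() records nothing:

# Sketch: requires_grad marks a tensor for recording; no_grad() suspends it.
import torch

x = torch.ones(3, requires_grad=True)

y = x * 2
print(y.requires_grad, y.grad_fn) # True <MulBackward0 ...>: the op was recorded

with torch.no_grad():
    z = x * 2
print(z.requires_grad, z.grad_fn) # False None: nothing was recorded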

The error you are getting in the last example tells you that you cannot use a.grad in an in-place operation on the leaf, but you may use a = a - a.grad, as in this example:

# example3
import torch
a = torch.randn(10, 5, requires_grad = True)
z = a * 3
l = z - 0.5
l.sum().backward()
a = a - a.grad
print(a.grad)
print(a.requires_grad)

Out:

None
True

/usr/local/lib/python3.7/dist-packages/torch/_tensor.py:1013: UserWarning: The .grad attribute of a Tensor that is not a leaf Tensor is being accessed. Its .grad attribute won't be populated during autograd.backward(). If you indeed want the .grad field to be populated for a non-leaf Tensor, use .retain_grad() on the non-leaf Tensor. If you access the non-leaf Tensor by mistake, make sure you access the leaf Tensor instead. See github.com/pytorch/pytorch/pull/30531 for more informations. (Triggered internally at  aten/src/ATen/core/TensorBody.h:417.)
  return self._grad
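As the warning suggests, if you actually want the rebound, non-leaf a to receive a .grad, you can opt in with .retain_grad() before the next backward(). A rough sketch:

# Sketch: retain_grad() asks autograd to populate .grad on a non-leaf tensor.
import torch

a = torch.randn(10, 5, requires_grad=True)
(a * 3 - 0.5).sum().backward()

a = a - a.grad                    # `a` is now a non-leaf (it has a grad_fn)
a.retain_grad()                   # opt in to .grad for this non-leaf tensor
(a * 3).sum().backward()
print(a.grad)                     # a tensor of 3s instead of None, and no warning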
