How to require gradients only for some elements of a PyTorch tensor?

Posted on 2025-01-30 22:47:26


I would like to use a tensor in which only a few elements are variable, i.e. considered during the backpropagation step. Consider for example:

self.conv1 = nn.Conv2d(3, 16, 3, 1, padding=1, bias=False)
mask = torch.zeros(self.conv1.weight.data.shape, requires_grad=False)
self.conv1.weight.data[0, 0, 0, 0] += mask[0, 0, 0, 0]
print(self.conv1.weight.data[0, 0, 0, 0].requires_grad)

It will output False.


Comments (1)

翻了热茶 2025-02-06 22:47:26

You can only switch gradient computation on and off at the tensor level, which means that requires_grad is not element-wise. What you observe is different because you accessed the requires_grad attribute of conv1.weight.data, which is not the same object as its wrapper tensor conv1.weight!

Notice the difference:

>>> conv1 = nn.Conv2d(3, 16, 3) # requires_grad=True by default

>>> conv1.weight.requires_grad
True

>>> conv1.weight.data.requires_grad
False

conv1.weight is the weight tensor, while conv1.weight.data is the underlying data tensor, which never requires gradient because it is detached from the autograd graph.
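To make this concrete, here is a small interactive sketch (the layer shape is arbitrary) showing that requires_grad is a whole-tensor flag that propagates through slicing, while .data is detached from autograd:

>>> import torch
>>> import torch.nn as nn
>>> conv1 = nn.Conv2d(3, 16, 3)

>>> conv1.weight.is_leaf                     # the parameter itself is a leaf tensor tracked by autograd
True

>>> conv1.weight[0, 0, 0, 0].requires_grad   # a slice inherits the flag of the whole tensor
True

>>> conv1.weight.data.requires_grad          # .data is a detached view, never tracked
False

This is also why the in-place edit through .data in the question bypasses autograd entirely.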


Now onto how to solve the problem of partially requiring gradient computation on a tensor. Instead of framing it as "only require gradients for some elements of the tensor", you can think of it as "discard the gradients for some elements of the tensor". You can do so by overwriting the gradient values of the tensor at the desired positions after the backward pass:

>>> conv1 = nn.Conv2d(3, 1, 2)
>>> mask = torch.ones_like(conv1.weight)

For example, to prevent the update of the first component of the convolutional layer:

>>> mask[0,0,0,0] = 0 

After the backward pass, you can mask the gradient on conv1.weight:

>>> conv1(torch.rand(1,3,10,10)).mean().backward()
>>> conv1.weight.grad *= mask
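For completeness, here is a minimal, self-contained sketch of how this masking might be wired into a training step; the optimizer, learning rate, and dummy input are illustrative choices, not part of the original answer:

import torch
import torch.nn as nn

conv1 = nn.Conv2d(3, 1, 2)
optimizer = torch.optim.SGD(conv1.parameters(), lr=0.1)   # plain SGD, no momentum

# Freeze only the very first weight element; everything else stays trainable.
mask = torch.ones_like(conv1.weight)
mask[0, 0, 0, 0] = 0

x = torch.rand(1, 3, 10, 10)                    # dummy input batch
frozen_before = conv1.weight[0, 0, 0, 0].item()

optimizer.zero_grad()
loss = conv1(x).mean()
loss.backward()
conv1.weight.grad *= mask                       # zero the gradient of the frozen element
optimizer.step()

# With plain SGD a zero gradient means no update, so the frozen element is unchanged.
assert conv1.weight[0, 0, 0, 0].item() == frozen_before

If you prefer not to repeat the masking line every iteration, the same effect can be obtained once with a gradient hook, e.g. conv1.weight.register_hook(lambda g: g * mask), which rewrites the gradient automatically on every backward pass.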