How to set the gradients of a network in PyTorch

Posted 2025-01-27 11:45:35

I have a model in PyTorch. The model can take any shape, but let's assume this is the model:

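from torch.nn import BatchNorm1d, Dropout, Flatten, Linear, ReLU, Sequential, Softmax

torch_model = Sequential(
    Flatten(),
    Linear(28 * 28, 256),
    Dropout(.4),
    ReLU(),
    BatchNorm1d(256),
    ReLU(),
    Linear(256, 128),
    Dropout(.4),
    ReLU(),
    BatchNorm1d(128),
    ReLU(),
    Linear(128, 10),
    Softmax(dim=1)  # explicit dim: normalize over the 10 class scores
)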

I am using the SGD optimizer, and I want to set the gradient for each of the layers so that the SGD algorithm moves the parameters in the direction I want.

Let's say I want all the gradients for all the layers to be ones (torch.ones_like(gradient_shape)). How can I do this?
Thanks.

我不是你的备胎 2025-02-03 11:45:35

In PyTorch, with a model defined as yours above, you can iterate over the layers like this:

for layer in list(torch_model.modules())[1:]:
  print(layer)

You have to add the [1:] because the first module returned is the Sequential container itself. In any layer that has them, you can access the weights with layer.weight. However, it is important to remember that some layers, like Flatten, Dropout and ReLU, don't have weights. A way to check for weights, and then add 1 to each of them, would be:

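import torch

for layer in list(torch_model.modules())[1:]:
  # Flatten, Dropout, ReLU and Softmax have no 'weight' attribute
  if hasattr(layer, 'weight'):
    # no_grad() keeps autograd from recording the in-place update
    with torch.no_grad():
      layer.weight.add_(1)  # add 1 to every element of the weight, in place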

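I tested the above on your model and it does add 1 to every weight (biases are left untouched, since only layer.weight is modified). Worth noting that it won't work without torch.no_grad(): modifying a leaf tensor that requires grad in place raises a RuntimeError, and you don't want PyTorch tracking these changes anyway.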
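The loop above changes the weights directly. If the goal is instead to hand SGD a gradient of your own choosing, as the question asks, one possible approach is to assign each parameter's .grad attribute and then call optimizer.step(). The following is only a minimal sketch under some assumptions: the small stand-in model, the names model, lr and before, and the learning rate of 0.1 are illustrative and not taken from the question, and the optimizer is plain SGD with no momentum or weight decay.

```python
import torch
from torch import nn

# Hypothetical stand-in model; any nn.Module should work the same way.
model = nn.Sequential(
    nn.Flatten(),
    nn.Linear(28 * 28, 256),
    nn.ReLU(),
    nn.BatchNorm1d(256),
    nn.Linear(256, 10),
)

lr = 0.1  # assumed learning rate, chosen only for this illustration
optimizer = torch.optim.SGD(model.parameters(), lr=lr)

# Keep a copy of the parameters so the update can be inspected afterwards.
before = [p.detach().clone() for p in model.parameters()]

# Overwrite the gradient of every parameter (weights and biases alike)
# with a tensor of ones of the same shape, dtype and device.
for p in model.parameters():
    p.grad = torch.ones_like(p)

# Plain SGD then applies p <- p - lr * p.grad to each parameter.
optimizer.step()

# Largest deviation from the expected update p_old - lr.
max_diff = max(
    (new.detach() - (old - lr)).abs().max().item()
    for old, new in zip(before, model.parameters())
)
print(max_diff)
```

Because model.parameters() covers biases as well as weights, no hasattr check is needed here. Keep in mind that a later loss.backward() accumulates into .grad, so call optimizer.zero_grad() (or reassign .grad) before the next step if you want full control over the direction.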
