Gradient of an image in PyTorch - for gradient penalty calculation in WGAN



I am following this GitHub repo for the WGAN implementation with gradient penalty.

And I am trying to understand the following method, which does the job of unit-testing the gradient-penalty calculations.

import torch

# gradient_penalty and test_get_gradient are defined elsewhere in the repo
def test_gradient_penalty(image_shape):
    # A zero gradient has norm 0, so its penalty should be (0 - 1)^2 = 1
    bad_gradient = torch.zeros(*image_shape)
    bad_gradient_penalty = gradient_penalty(bad_gradient)
    assert torch.isclose(bad_gradient_penalty, torch.tensor(1.))

    # A uniform gradient scaled so each image's L2 norm is 1 should get penalty 0
    image_size = torch.prod(torch.Tensor(image_shape[1:]))
    good_gradient = torch.ones(*image_shape) / torch.sqrt(image_size)
    good_gradient_penalty = gradient_penalty(good_gradient)
    assert torch.isclose(good_gradient_penalty, torch.tensor(0.))

    random_gradient = test_get_gradient(image_shape)
    random_gradient_penalty = gradient_penalty(random_gradient)
    assert torch.abs(random_gradient_penalty - 1) < 0.1

# Now pass a tuple argument for the image dimensions of
# (batch_size, channel, height, width)
test_gradient_penalty((256, 1, 28, 28))

I don't understand the line below:

good_gradient = torch.ones(*image_shape) / torch.sqrt(image_size)

Here, torch.ones(*image_shape) simply creates a 4-D tensor filled with 1s, and torch.sqrt(image_size) evaluates to tensor(28.).
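You can confirm that value directly:

import torch

# image_shape[1:] is (1, 28, 28), so image_size = 1 * 28 * 28 = 784
image_size = torch.prod(torch.Tensor((1, 28, 28)))
print(image_size)              # tensor(784.)
print(torch.sqrt(image_size))  # tensor(28.)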

So, what I am trying to understand is why I need to divide the 4-D tensor by tensor(28.) to get the good_gradient.

If I print bad_gradient, it will be a 4-D tensor like this:

tensor([[[[0., 0., 0.,  ..., 0., 0., 0.],
          [0., 0., 0.,  ..., 0., 0., 0.],
          [0., 0., 0.,  ..., 0., 0., 0.],
          ...,
          [0., 0., 0.,  ..., 0., 0., 0.],
          [0., 0., 0.,  ..., 0., 0., 0.],
          [0., 0., 0.,  ..., 0., 0., 0.]]],
        ...

If I print good_gradient, the output will be

tensor([[[[0.0357, 0.0357, 0.0357,  ..., 0.0357, 0.0357, 0.0357],
          [0.0357, 0.0357, 0.0357,  ..., 0.0357, 0.0357, 0.0357],
          [0.0357, 0.0357, 0.0357,  ..., 0.0357, 0.0357, 0.0357],
          ...,
          [0.0357, 0.0357, 0.0357,  ..., 0.0357, 0.0357, 0.0357],
          [0.0357, 0.0357, 0.0357,  ..., 0.0357, 0.0357, 0.0357],
          [0.0357, 0.0357, 0.0357,  ..., 0.0357, 0.0357, 0.0357]]],
        ...


Answer by 清引, 2025-01-25:


For the line

good_gradient = torch.ones(*image_shape) / torch.sqrt(image_size)

First, note that the gradient penalty term in WGAN is:

(norm(gradient(interpolated)) - 1)^2

And for the ideal gradient (i.e. a good gradient), this penalty term would be 0. That is, a good gradient is one whose gradient_penalty is as close to 0 as possible.
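As a point of reference, a gradient_penalty consistent with the penalty term above and with the test's assertions might look like this minimal sketch (hypothetical; the actual implementation is in the linked repo):

import torch

def gradient_penalty(gradient):
    # Flatten each image's gradient into one row per batch element
    gradient = gradient.view(len(gradient), -1)
    # L2 norm of each flattened per-image gradient
    gradient_norm = gradient.norm(2, dim=1)
    # Mean squared distance of each norm from the target value of 1
    return torch.mean((gradient_norm - 1) ** 2)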

This means the following should hold, after considering the L2-norm of the gradient:

(norm(gradient(x')) - 1)^2 = 0

i.e. norm(gradient(x')) = 1

i.e. sqrt(Sum(gradient_i^2)) = 1

Now if you just continue simplifying the math expression above (considering how the norm is calculated; see my note below), you will end up with

good_gradient = torch.ones(*image_shape) / torch.sqrt(image_size)
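To spell out that simplification: if every element of the gradient equals some constant v, then for a single image with 1 * 28 * 28 = 784 elements,

sqrt(Sum(gradient_i^2)) = sqrt(784 * v^2) = 28 * v = 1

so v = 1/28 ≈ 0.0357, which is exactly torch.ones(*image_shape) / torch.sqrt(image_size), and matches the 0.0357 values in the printout above.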

Since you are passing image_shape as (256, 1, 28, 28), image_size is 1 * 28 * 28 = 784, so torch.sqrt(image_size) in your case is tensor(28.).

Effectively, the line above divides each element of a 4-D tensor like [[[[1., 1., ...]]]] by the scalar tensor(28.).
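As a quick sanity check that each image in good_gradient then has an L2 norm of exactly 1:

import torch

image_shape = (256, 1, 28, 28)
image_size = torch.prod(torch.Tensor(image_shape[1:]))  # 1 * 28 * 28 = 784
good_gradient = torch.ones(*image_shape) / torch.sqrt(image_size)

# Per-image L2 norm: flatten each image, then take the 2-norm
norms = good_gradient.view(len(good_gradient), -1).norm(2, dim=1)
print(torch.allclose(norms, torch.ones(256)))  # True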


Separately, note how the norm is calculated.


torch.norm without extra arguments computes what is called the Frobenius norm, which effectively reshapes the matrix into one long vector and returns the 2-norm of that vector.

Given an M x N matrix, the Frobenius norm is defined as the square root of the sum of the squares of the matrix's elements.

Input: mat[][] = [[1, 2], [3, 4]] 
Output: 5.47723 
sqrt(1^2 + 2^2 + 3^2 + 4^2) = sqrt(30) = 5.47723
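The same check in PyTorch:

import torch

mat = torch.tensor([[1., 2.], [3., 4.]])
print(torch.norm(mat))          # tensor(5.4772) - the Frobenius norm by default
print(mat.reshape(-1).norm(2))  # tensor(5.4772) - 2-norm of the flattened matrix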
