The transformation for Alexnet image input is below:
transforms.Resize(256),
transforms.CenterCrop(224),
Why do we first resize the image to 256 and then center-crop to 224? I know that 224x224 is the default image size for ImageNet, but why can't we resize the image directly to 224x224?
Perhaps this is best illustrated visually. Consider the following image (128x128px):
If we resized it directly to 16x16px, we'd end up with:
But if we first resize it to 24x24px
and then center-crop it to 16x16px, it looks like this:
As you can see, this discards the border while retaining the detail in the center. Note the differences side by side:
The same applies to 224px vs. 256px, just at a higher resolution.
When called with a single integer size, the output of the Resize transform depends on the aspect ratio of the input image: the smaller edge is matched to size and the other edge is scaled to preserve the ratio, so the result is not [size, size]. I think running these two steps separately is a way to work around this and guarantee a consistent input size of [224, 224].
In my opinion, for prediction you could resize directly with Resize((224, 224)) and the final effect would be much the same. Of course the borders would not be trimmed, but I don't see why that step would matter for arbitrary custom data.
Link to the Resize documentation: https://pytorch.org/vision/stable/generated/torchvision.transforms.Resize.html