Applying ResNet to CIFAR10 after resizing (PyTorch)

Posted on 2025-01-22 22:31:40

Given a pre-trained ResNet152, I'm trying to compute prediction benchmarks on some common datasets (using PyTorch), and the first RGB dataset that came to mind was CIFAR10. The problem is that CIFAR10 images are 3x32x32, while ResNet expects 3x224x224. I resized the data using the usual transforms approach:

import torch
from torchvision import datasets, transforms

batch_size = 64  # not given in the original post; assumed value

# ImageNet-style preprocessing: upsample 32x32 to 256x256, center-crop to 224x224, normalize
preprocess = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])
train = datasets.CIFAR10(root='./data', train=True, download=True, transform=preprocess)
test = datasets.CIFAR10(root='./data', train=False, download=True, transform=preprocess)
train_loader = torch.utils.data.DataLoader(train, batch_size=batch_size)
test_loader = torch.utils.data.DataLoader(test, batch_size=batch_size)

but this results in blurry samples and bad predictions. What is the best approach in such cases? I see many papers using these datasets with advanced models like ResNet and VGG, and I'm not sure how this technical issue is resolved.

Thank you for your response!

Comments (2)

泪意 2025-01-29 22:31:40

Yes, you need to resize the input images to 3x224x224. After doing so, and following a normal training procedure, you should achieve outstanding results on CIFAR-10 (e.g. 96% on the test set).

I guess the main problem is that you're using a network pre-trained on higher-resolution images (resnet152 comes pre-trained on ImageNet). Without any further training, you can't expect good results after changing the dataset so drastically.

千秋岁 2025-01-29 22:31:40

For image classification benchmarks, I recommend using commonly used higher-resolution datasets (also predefined in torchvision): LSUN or Places365.
