How to detect images that are not in the training categories when using ResNet50
I have trained ResNet50 on four categories of images. It works fantastically when I feed it an image from any one of the four categories; I have essentially 100% accuracy on images in these categories.
However, when I feed my trained ResNet50 model an image of a similar object that is not in one of the original four categories, the prediction comes back as one of the four existing classes. By this I mean that, in the array returned with the likelihood of each category, the likelihood of one of the categories is in many cases basically 1. For example, when I query the model about an image that is not in one of the four categories, the prediction array looks like this:
[1.3492944e-07 9.9999988e-01 8.3132584e-14 1.4716975e-24]
Here is the prediction array for an image that the model was trained on:
[1.8217645e-27 1.0000000e+00 3.6731971e-32 0.0000000e+00]
These scores are different, but not by much. Many of the images that are not in one of the trained-for categories have a 1.00000000 for one of the labels.
I had been planning to deal with the oddball images by checking whether the maximum value in the prediction array was below some threshold. But most of my maximum values are above 0.99999, so I can't differentiate between images that belong to the trained categories and images that do not.
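To make the problem concrete, here is a minimal sketch of the threshold check described above, applied to the two prediction arrays quoted in this question. The function name `classify_or_reject` and the 0.99999 threshold are illustrative assumptions, not part of any library.

```python
def classify_or_reject(probs, threshold=0.99999):
    """Return the index of the top class, or None if its score is below threshold.

    Hypothetical helper: the name and the default threshold are assumptions.
    """
    best = max(range(len(probs)), key=lambda i: probs[i])
    return best if probs[best] >= threshold else None


# The two prediction arrays quoted in the question.
out_of_category = [1.3492944e-07, 9.9999988e-01, 8.3132584e-14, 1.4716975e-24]
in_category = [1.8217645e-27, 1.0000000e+00, 3.6731971e-32, 0.0000000e+00]

unknown_result = classify_or_reject(out_of_category)
known_result = classify_or_reject(in_category)
print(unknown_result, known_result)
```

Both calls return class index 1, so even a threshold this strict does not separate the out-of-category image from the in-category one.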
I plan to train my model for N buckets. When I am running the system I will occasionally have images that are not in one of the N buckets and I need to know that. I don't care what they are, I just want to know when an image is not in one of the N buckets.
ResNet50 does a great job of forcing everything into one of the categories, even when the image does not belong in any of them.
My images are super well defined! I wonder if I am somehow overtraining or overlooking some other obvious error.
Here is an example of an image that was correctly categorized:
[Image: in training set and correctly categorized]
Here is an image that is not part of the training set that was then categorized into one of the categories:
[Image: not in training set and incorrectly categorized]
In summary: I am trying to sort images and I need to know when one of the images is not part of the training categories so I can reject that image. Restated, I want to sort images into buckets: the known, trained-for buckets and one unknown bucket.
Is there any way to do this?
Should I use a different classifier than Resnet50?
My images are grayscale, 150x150, resized from larger originals with bicubic interpolation. I have about 1,600 training images and 200 validation images per category. My accuracy and val_accuracy are 0.9997 after 3 epochs.
Comments (1)
Your model only knows about 4 classes. It, or any other model (say, MobileNet), will always look at an image and assign probabilities to each of the 4 classes. You could put in a picture of a water buffalo and it will still try to classify it. Usually, but not always, if the out-of-class image you put in is very different from your training images, the class with the highest probability will have a probability value well below 1.0. However, in your case the out-of-class image is NOT all that different from the images in your dataset, hence a fairly high false probability prediction.
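A small sketch of why this happens, assuming the classifier ends in a softmax layer (the question's arrays sum to 1, which is consistent with that, but it is an assumption). Softmax only compares the classes to each other, so the outputs always sum to 1 and there is no way to express "none of these". The logit values below are made up for illustration.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of raw scores."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical logits: one set with no clear winner, one with a large margin.
balanced = softmax([1.0, 1.2, 0.9, 1.1])
lopsided = softmax([2.0, 18.0, -12.0, -38.0])

print(sum(balanced), max(balanced))
print(sum(lopsided), max(lopsided))
```

Both outputs sum to 1. The second shows that a large gap between logits is enough to push one class to nearly 1.0, whether or not the image really belongs to any trained class.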
All I can think of is this: if your out-of-class images will be generically similar to each other, you could create a 5th class by gathering some "typical" out-of-class images, then train the model on these 5 classes using the data you have plus the new images. I made a model that classified 50 different dog breeds. It was extremely accurate. I put in a picture of Donald Trump and he was predicted as being a chihuahua!
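On the inference side, the 5th-class idea could look roughly like the sketch below. It assumes the retrained model returns five probabilities with the catch-all class last; the class names and the helpers `route` and `is_rejected` are hypothetical.

```python
# Hypothetical class names; the fifth entry is the catch-all "unknown" class.
CLASS_NAMES = ["class_a", "class_b", "class_c", "class_d", "unknown"]


def route(probs, class_names=CLASS_NAMES):
    """Return the bucket name with the highest predicted probability."""
    best = max(range(len(probs)), key=lambda i: probs[i])
    return class_names[best]


def is_rejected(probs, class_names=CLASS_NAMES):
    """True when the model's top prediction is the catch-all class."""
    return route(probs, class_names) == "unknown"


# Made-up 5-way predictions, for illustration only.
print(route([0.01, 0.97, 0.01, 0.005, 0.005]))
print(is_rejected([0.02, 0.10, 0.03, 0.05, 0.80]))
```

This only helps to the extent that the gathered "typical" out-of-class images resemble what shows up later; an unfamiliar kind of image can still land in one of the four known buckets.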