Why are encoded representations bad for classification?
Suppose I have a pre-trained, well-performing autoencoder. When I train a classifier on the encodings it produces, the classifier does very poorly. In particular, it does much worse than a classifier trained on the normal (i.e. unencoded) inputs.
However, when I fine-tune the encoder using the classification loss, the classifier does quite well. (A sketch of both training setups follows the code below.)
Why are encoded representations bad for classification?
Details: I'm working on CIFAR-100, classifying the coarse image labels, i.e. 20 classes (though I believe I had the same problem when classifying CIFAR-10). The classifier has 5 dense layers and uses dropout:
import tensorflow as tf

num_classes = 20  # CIFAR-100 coarse superclass labels

# Dense classification head, trained on the flat encodings produced
# by the autoencoder. The output layer emits raw logits (no activation).
classifier = tf.keras.Sequential([
    tf.keras.layers.Dense(512, activation='relu', name='classifier_hidden_1'),
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Dense(256, activation='relu', name='classifier_hidden_2'),
    tf.keras.layers.Dense(128, activation='relu', name='classifier_hidden_3'),
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Dense(64, activation='relu', name='classifier_hidden_4'),
    tf.keras.layers.Dense(num_classes, activation=None, name='classifier_out'),
], name='classifier')
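For completeness, here is a minimal sketch of the two setups I am comparing. It assumes a hypothetical encoder model (called encoder below) that maps images to flat encoding vectors; encoder is a placeholder for my pre-trained autoencoder's encoder half, everything else is standard Keras:

# Load CIFAR-100 with the 20 coarse superclass labels.
(x_train, y_train), _ = tf.keras.datasets.cifar100.load_data(label_mode='coarse')
x_train = x_train.astype('float32') / 255.0

# The classifier outputs logits, so use from_logits=True.
loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)

# Setup 1: freeze the pre-trained encoder and train only the classifier head.
encoder.trainable = False
frozen = tf.keras.Sequential([encoder, classifier])
frozen.compile(optimizer='adam', loss=loss_fn, metrics=['accuracy'])
frozen.fit(x_train, y_train, epochs=10)  # this is the setup that does poorly

# Setup 2: fine-tune the encoder with the classification loss.
# (In a real comparison the head would be re-initialized between runs.)
encoder.trainable = True
finetuned = tf.keras.Sequential([encoder, classifier])
finetuned.compile(optimizer='adam', loss=loss_fn, metrics=['accuracy'])
finetuned.fit(x_train, y_train, epochs=10)  # this is the setup that does well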