Loss at the start of fine-tuning is higher than the loss from transfer learning
Since I start fine-tuning from the weights learned during transfer learning, I would expect the loss to be the same or lower. However, it looks like fine-tuning starts from a different set of weights.
Code to start transfer learning:
base_model = tf.keras.applications.MobileNetV2(input_shape=IMG_SHAPE,
                                               include_top=False,
                                               weights='imagenet')
base_model.trainable = False
model = tf.keras.Sequential([
    base_model,
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(units=3, activation='sigmoid')
])
model.compile(optimizer='adam',
              loss='binary_crossentropy',
              metrics=['accuracy'])
epochs = 1000
callback = tf.keras.callbacks.EarlyStopping(patience=10, restore_best_weights=True)
history = model.fit(train_generator,
                    steps_per_epoch=len(train_generator),
                    epochs=epochs,
                    validation_data=val_generator,
                    validation_steps=len(val_generator),
                    callbacks=[callback])
Output from last epoch:
Epoch 29/1000
232/232 [==============================] - 492s 2s/step - loss: 0.1298 - accuracy: 0.8940 - val_loss: 0.1220 - val_accuracy: 0.8937
Code to start fine tuning:
model.trainable = True
# Fine-tune from this layer onwards
fine_tune_at = -20
# Freeze all the layers before the `fine_tune_at` layer
for layer in model.layers[:fine_tune_at]:
    layer.trainable = False
model.compile(optimizer=tf.keras.optimizers.Adam(1e-5),
              loss='binary_crossentropy',
              metrics=['accuracy'])
history_fine = model.fit(train_generator,
                         steps_per_epoch=len(train_generator),
                         epochs=epochs,
                         validation_data=val_generator,
                         validation_steps=len(val_generator),
                         callbacks=[callback])
But this is what I see for the first few epochs:
Epoch 1/1000
232/232 [==============================] - ETA: 0s - loss: 0.3459 - accuracy: 0.8409
/usr/local/lib/python3.7/dist-packages/PIL/Image.py:960: UserWarning: Palette images with Transparency expressed in bytes should be converted to RGBA images
  "Palette images with Transparency expressed in bytes should be "
232/232 [==============================] - 509s 2s/step - loss: 0.3459 - accuracy: 0.8409 - val_loss: 0.7755 - val_accuracy: 0.7262
Epoch 2/1000
232/232 [==============================] - 502s 2s/step - loss: 0.1889 - accuracy: 0.9066 - val_loss: 0.5628 - val_accuracy: 0.8881
Eventually the loss drops below the transfer learning loss:
Epoch 87/1000
232/232 [==============================] - 521s 2s/step - loss: 0.0232 - accuracy: 0.8312 - val_loss: 0.0481 - val_accuracy: 0.8563
Why was the loss in the first epoch of fine tuning higher than the last loss from transfer learning?
Comments (1)
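According to the TensorFlow/Keras guide on transfer learning and fine-tuning, the parameters of the BatchNormalization layers should be left alone. If those layers leave inference mode when the model is unfrozen, the updates to their moving mean and variance can suddenly destroy what the model has learned.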
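One likely reason the BatchNormalization layers ended up trainable in the question's code: `model.layers` on that Sequential model has only three entries (the MobileNetV2 backbone, the pooling layer, and the Dense layer), so `model.layers[:-20]` is an empty slice and nothing is re-frozen after `model.trainable = True`. A quick illustration, using a plain list of placeholder strings as a stand-in for `model.layers` (the strings are hypothetical labels, not real layer objects):

```python
# Placeholder labels standing in for the three entries of `model.layers`
# in the question's Sequential model (not real Keras layers).
outer_layers = ["mobilenetv2 backbone", "global_average_pooling2d", "dense"]

fine_tune_at = -20

# Same slice as in the question's freezing loop.
frozen = outer_layers[:fine_tune_at]

print(len(outer_layers))  # 3
print(frozen)             # []
```

To freeze part of the backbone, the slice has to be taken over `base_model.layers` rather than the outer model's layer list.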
Below is what I did to fix the sudden increase in loss after unfreezing layers:
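The answer's original code did not survive in this copy, so what follows is only a minimal sketch of one way to do this, not necessarily what the answerer wrote. It assumes the `base_model` and `model` objects from the question; the helper name `unfreeze_top_layers` and its `num_unfrozen` parameter are made up for illustration. It unfreezes the last layers of the backbone but keeps every BatchNormalization layer frozen, relying on the TF 2.x behaviour that a BatchNormalization layer with `trainable = False` runs in inference mode. The guide itself instead calls the base model with `training=False` in a functional model.

```python
import tensorflow as tf


def unfreeze_top_layers(base_model, num_unfrozen=20):
    """Unfreeze the last `num_unfrozen` layers of `base_model`, except BatchNormalization.

    Hypothetical helper for illustration; not part of Keras.
    """
    base_model.trainable = True
    # Slice the backbone's own layer list; the outer Sequential has only three entries.
    for layer in base_model.layers[:-num_unfrozen]:
        layer.trainable = False
    # Keep every BatchNormalization layer frozen so it stays in inference mode
    # and its moving statistics are not updated during fine-tuning.
    for layer in base_model.layers:
        if isinstance(layer, tf.keras.layers.BatchNormalization):
            layer.trainable = False
    return base_model


# Usage with the names from the question (recompile after changing `trainable`):
# unfreeze_top_layers(base_model, num_unfrozen=20)
# model.compile(optimizer=tf.keras.optimizers.Adam(1e-5),
#               loss='binary_crossentropy', metrics=['accuracy'])
```

Running `model.evaluate(val_generator)` right after recompiling and before calling `fit` is a simple way to check that fine-tuning really starts from the transfer learning loss.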