Categorical accuracy of a Keras Hyperband tuner for a CNN is constantly equal to 1/3
I am attempting to build and optimise a CNN for classification of pneumonia types (bacterial / viral / no pneumonia) using the "Chest X-Ray Images (Pneumonia) with new class" Kaggle dataset (https://www.kaggle.com/datasets/ahmedhaytham/chest-xray-images-pneumonia-with-new-class).
I relied on a pneumonia classification CNN model described in one of the Keras tutorials: https://keras.io/examples/vision/xray_classification_with_tpus/, although I am using a GPU for training instead of a TPU.
In addition, I am trying to tune some model parameters using Keras Hyperband tuner, as described here: https://www.tensorflow.org/tutorials/keras/keras_tuner
This is the model I built (IMAGE_SIZE[0] and IMAGE_SIZE[1] are both equal to 250, and BATCH_SIZE is 32):
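import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
import keras_tuner as kt

def convolution_block(filters, inputs):
    # Two separable 3x3 convolutions, then batch norm and 2x2 max pooling.
    x = layers.SeparableConvolution2D(
        filters,
        3,
        activation = "relu",
        padding = "same")(inputs)
    x = layers.SeparableConvolution2D(
        filters,
        3,
        activation = "relu",
        padding = "same")(x)
    x = layers.BatchNormalization()(x)
    outputs = layers.MaxPooling2D()(x)
    return outputs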
def dense_block(units, dropout_rate, inputs):
    x = layers.Dense(
        units,
        activation = "relu",
    )(inputs)
    x = layers.BatchNormalization()(x)
    outputs = layers.Dropout(
        dropout_rate
    )(x)
    return outputs
def model_builder_for_tuning(hp):
    inputs = keras.Input(
        shape = (IMAGE_SIZE[0], IMAGE_SIZE[1], 3)
    )
    x = layers.Rescaling(1.0 / 255)(inputs)
    kernel_size1 = hp.Choice(
        name = "kernel_size1",
        values = [3, 5, 7]
    )
    x = layers.Convolution2D(
        16,
        kernel_size1,
        activation = "relu",
        padding = "same"
    )(x)
    x = layers.Convolution2D(
        16,
        kernel_size1,
        activation = "relu",
        padding = "same"
    )(x)
    x = layers.MaxPooling2D()(x)
    filters_layer2 = hp.Choice(
        name = "filter_layer2",
        values = [32, 64]
    )
    filters_layer3 = hp.Choice(
        name = "filter_layer3",
        values = [64, 128]
    )
    filters_layer4 = hp.Choice(
        name = "filter_layer4",
        values = [128, 256]
    )
    filters_layer5 = hp.Choice(
        name = "filter_layer5",
        values = [256, 512]
    )
    dropout_rate = hp.Choice(
        name = "convolution_dropout",
        values = [0.2, 0.4, 0.6]
    )
    x = convolution_block(filters_layer2, x)
    x = convolution_block(filters_layer3, x)
    x = convolution_block(filters_layer4, x)
    x = layers.Dropout(dropout_rate)(x)
    x = convolution_block(filters_layer5, x)
    x = layers.Dropout(dropout_rate)(x)
    x = layers.Flatten()(x)
    x = dense_block(512, 0.7, x)
    x = dense_block(128, 0.5, x)
    x = dense_block(64, 0.3, x)
    outputs = layers.Dense(3, activation = "softmax")(x)
    model = keras.Model(inputs = inputs, outputs = outputs)
    METRICS = [
        tf.keras.metrics.CategoricalAccuracy(name = "cat_accuracy"),
        tf.keras.metrics.Precision(name = "precision"),
        tf.keras.metrics.Recall(name = "recall")
    ]
    learning_rate_hp = hp.Choice(
        name = "learning_rate",
        values = [1e-2, 1e-3, 1e-4],
    )
    model.compile(
        optimizer = tf.keras.optimizers.Adam(
            learning_rate = learning_rate_hp,
        ),
        loss = "categorical_crossentropy",
        metrics = METRICS,
    )
    return model
The tuner is instantiated using:
objective = kt.Objective("val_cat_accuracy", direction = "max")
tuner = kt.Hyperband(
    model_builder_for_tuning,
    objective = objective,
    max_epochs = 10,
    factor = 3,
    directory = "/content/drive/MyDrive/Colab Notebooks/hp_xray-new",
    project_name = "chest_xray_hp_tuning",
)
and the hyperparameter search is started with:
with tf.device('/device:GPU:0'):
    tuner.search(
        train_generator,
        epochs = 25,
        validation_data = val_generator,
        callbacks = [early_stop_callback_hpt]
    )
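A side note on the epoch budget: the tuner log further down shows tuner/epochs equal to 2 and "Epoch 1/2" even though epochs = 25 is passed to tuner.search, which suggests that Hyperband chooses the number of epochs per trial itself. The sketch below is only an approximation of such a schedule. It assumes that the budget of a trial is ceil(max_epochs / factor ** (bracket - round)) and that bracket b has b + 1 rounds; the helper name hyperband_epochs is hypothetical and nothing here is copied from the Keras Tuner source.

```python
import math

def hyperband_epochs(max_epochs, factor, bracket, round_num):
    """Assumed per-trial epoch budget for a given bracket and round."""
    return math.ceil(max_epochs / factor ** (bracket - round_num))

# Values taken from the tuner above: max_epochs = 10, factor = 3.
# Assumption: bracket b runs rounds 0..b.
for bracket in range(2, -1, -1):
    budgets = [hyperband_epochs(10, 3, bracket, r) for r in range(bracket + 1)]
    print(f"bracket {bracket}: epochs per round = {budgets}")
```

Under these assumptions, bracket 2, round 0 gets 2 epochs, which agrees with the log. Trials in that round are therefore compared after a very short training run.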
Image data for training/validation is loaded using generators:
train_data_generator = tf.keras.preprocessing.image.ImageDataGenerator(
    zoom_range = 0.3,
    vertical_flip = True,
)
test_val_data_generator = tf.keras.preprocessing.image.ImageDataGenerator()
and the flow_from_directory function:
train_generator = train_data_generator.flow_from_directory(
    directory = train_dir,
    target_size = (IMAGE_SIZE[0], IMAGE_SIZE[1]),
    batch_size = BATCH_SIZE,
    class_mode = "categorical",
    shuffle = True
)
val_generator = test_val_data_generator.flow_from_directory(
    directory = val_dir,
    target_size = (IMAGE_SIZE[0], IMAGE_SIZE[1]),
    batch_size = BATCH_SIZE,
    class_mode = "categorical",
    shuffle = False
)
test_generator = test_val_data_generator.flow_from_directory(
    directory = test_dir,
    target_size = (IMAGE_SIZE[0], IMAGE_SIZE[1]),
    batch_size = BATCH_SIZE,
    class_mode = "categorical",
    shuffle = False
)
with the correct output:
Found 3900 images belonging to 3 classes.
Found 279 images belonging to 3 classes.
Found 300 images belonging to 3 classes.
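The totals above do not show how the images are split between the three classes. A minimal sketch for checking this directly on disk follows, assuming the usual flow_from_directory layout of one subdirectory per class. The helper name count_images_per_class and the set of file suffixes are my own assumptions, not part of Keras.

```python
from pathlib import Path

# Assumption: the images use one of these suffixes.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".bmp"}

def count_images_per_class(root):
    """Return {class_name: image_count} for each subdirectory of root."""
    counts = {}
    for class_dir in sorted(Path(root).iterdir()):
        if class_dir.is_dir():
            counts[class_dir.name] = sum(
                1
                for path in class_dir.rglob("*")
                if path.is_file() and path.suffix.lower() in IMAGE_SUFFIXES
            )
    return counts

# Example call with the validation directory used above:
# print(count_images_per_class(val_dir))
```

If the 279 validation images turn out to be split evenly (93 per class), a model that always predicts the same class would score exactly 1/3.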
However, as soon as I start hyperparameter tuning, I get very disturbing output, such as:
Trial 5 Complete [00h 02m 55s]
val_cat_accuracy: 0.3333333432674408
Best val_cat_accuracy So Far: 0.3333333432674408
Total elapsed time: 00h 16m 15s
Search: Running Trial #6
Value |Best Value So Far |Hyperparameter
5 |7 |kernel_size1
32 |64 |filter_layer2
64 |64 |filter_layer3
128 |128 |filter_layer4
512 |256 |filter_layer5
0.2 |0.4 |convolution_dropout
0.001 |0.0001 |learning_rate
2 |2 |tuner/epochs
0 |0 |tuner/initial_epoch
2 |2 |tuner/bracket
0 |0 |tuner/round
Epoch 1/2
122/122 [==============================] - 90s 720ms/step - loss: 1.1538 - cat_accuracy: 0.5269 - precision: 0.5461 - recall: 0.4723 - val_loss: 1.1022 - val_cat_accuracy: 0.3333 - val_precision: 0.0000e+00 - val_recall: 0.0000e+00
It seems that val_cat_accuracy is stuck around 0.33(3), and val_precision and val_recall are either 0 or 0.33(3) as well.
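To illustrate what these numbers could mean, here is a small NumPy sketch (not Keras code). It assumes that the 279 validation images are split evenly between the classes and that precision and recall are computed element-wise on the one-hot outputs with a 0.5 threshold. Under those assumptions, a model that gives the same prediction for every image reproduces exactly the two patterns described above.

```python
import numpy as np

def categorical_accuracy(y_true, y_prob):
    """Fraction of samples whose arg-max prediction matches the label."""
    return float(np.mean(y_true.argmax(axis=1) == y_prob.argmax(axis=1)))

def precision_recall(y_true, y_prob, threshold=0.5):
    """Element-wise precision and recall at a fixed probability threshold."""
    truth = y_true.astype(bool)
    pred = y_prob > threshold
    tp = np.sum(pred & truth)
    fp = np.sum(pred & ~truth)
    fn = np.sum(~pred & truth)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return float(precision), float(recall)

# Assumption: 279 validation images, 93 per class.
labels = np.repeat([0, 1, 2], 93)
y_true = np.eye(3)[labels]

# Case 1: the same low-confidence prediction for every image.
uncertain = np.tile([0.4, 0.3, 0.3], (len(labels), 1))
# Case 2: the same confident prediction for every image.
confident = np.tile([0.9, 0.05, 0.05], (len(labels), 1))

for name, y_prob in [("uncertain", uncertain), ("confident", confident)]:
    acc = categorical_accuracy(y_true, y_prob)
    prec, rec = precision_recall(y_true, y_prob)
    print(f"{name}: accuracy={acc:.4f} precision={prec:.4f} recall={rec:.4f}")
```

In the first case accuracy is 1/3 while precision and recall are 0, because no output exceeds the threshold; in the second case all three values are 1/3. This only shows that the reported metrics are consistent with a model predicting a single class on the validation set; it does not explain why that would happen.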
Surprisingly, when I built an initial model (with some randomly chosen kernel sizes, numbers of filters, and dropout rates), it trained for 36 epochs and reached val_cat_accuracy: 0.7348.
I would appreciate any help in pinpointing the possible reasons for this strange behaviour during hyperparameter tuning.
Thanks! :-)