Categorical accuracy of the Keras Hyperband tuner for a CNN is always equal to 1/3

Posted on 2025-01-17 14:51:16

I am attempting to build and optimise a CNN for classification of pneumonia types (bacterial / viral / no pneumonia) using the "Chest X-Ray Images (Pneumonia) with new class" Kaggle dataset (https://www.kaggle.com/datasets/ahmedhaytham/chest-xray-images-pneumonia-with-new-class).

I relied on a pneumonia classification CNN model described in one of the Keras tutorials: https://keras.io/examples/vision/xray_classification_with_tpus/, although I am training on a GPU instead of a TPU.

In addition, I am trying to tune some model parameters using Keras Hyperband tuner, as described here: https://www.tensorflow.org/tutorials/keras/keras_tuner

This is the model I built (IMAGE_SIZE[0] and IMAGE_SIZE[1] are both the same and equal to 250, and BATCH_SIZE is 32):

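import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
import keras_tuner as kt

def convolution_block(filters, inputs):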
    x = layers.SeparableConvolution2D(
        filters, 
        3, 
        activation = "relu",
        padding = "same")(inputs)

    x = layers.SeparableConvolution2D(
        filters, 
        3, 
        activation = "relu",
        padding = "same")(x)

    x = layers.BatchNormalization()(x)

    outputs = layers.MaxPooling2D()(x)

    return outputs      

def dense_block(units, dropout_rate, inputs):
    x = layers.Dense(
        units,
        activation = "relu",
    )(inputs)

    x = layers.BatchNormalization()(x)

    outputs = layers.Dropout(
        dropout_rate
    )(x)

    return outputs

def model_builder_for_tuning(hp):
    inputs = keras.Input(
        shape = (IMAGE_SIZE[0], IMAGE_SIZE[1], 3)
    )

    x = layers.Rescaling(1.0 / 255)(inputs)

    kernel_size1 = hp.Choice(
        name = "kernel_size1",
        values = [3, 5, 7]
    )

    x = layers.Convolution2D(
        16,
        kernel_size1,
        activation = "relu",
        padding = "same"
    )(x)
    x = layers.Convolution2D(
        16,
        kernel_size1,
        activation = "relu",
        padding = "same"
    )(x)
    x = layers.MaxPooling2D()(x)

    filters_layer2 = hp.Choice(
        name = "filter_layer2",
        values = [32, 64]
    )
    filters_layer3 = hp.Choice(
        name = "filter_layer3",
        values = [64, 128]
    )
    filters_layer4 = hp.Choice(
        name = "filter_layer4",
        values = [128, 256]
    )
    filters_layer5 = hp.Choice(
        name = "filter_layer5",
        values = [256, 512]
    )

    dropout_rate = hp.Choice(
        name = "convolution_dropout",
        values = [0.2, 0.4, 0.6]
    )

    x = convolution_block(filters_layer2, x)
    x = convolution_block(filters_layer3, x)
    
    x = convolution_block(filters_layer4, x)
    x = layers.Dropout(dropout_rate)(x)

    x = convolution_block(filters_layer5, x)
    x = layers.Dropout(dropout_rate)(x)

    x = layers.Flatten()(x)

    x = dense_block(512, 0.7, x)
    x = dense_block(128, 0.5, x)
    x = dense_block(64, 0.3, x)

    outputs = layers.Dense(3, activation = "softmax")(x)

    model = keras.Model(inputs = inputs, outputs = outputs)

    METRICS = [
        tf.keras.metrics.CategoricalAccuracy(name = "cat_accuracy"),
        tf.keras.metrics.Precision(name = "precision"),
        tf.keras.metrics.Recall(name = "recall")
    ]

    learning_rate_hp = hp.Choice(
        name = "learning_rate",
        values = [1e-2, 1e-3, 1e-4],
    )

    model.compile(
        optimizer = tf.keras.optimizers.Adam(
            learning_rate = learning_rate_hp,
        ),
        loss = "categorical_crossentropy",
        metrics = METRICS,
    )

    return model
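As a quick sanity check on the architecture above, the spatial size that reaches Flatten can be worked out by hand. The sketch below assumes the default MaxPooling2D settings (2x2 window, stride 2, "valid" padding) and relies on the "same"-padded convolutions leaving the spatial size unchanged; the helper name pooled_size is only illustrative.

```python
IMAGE_SIZE = (250, 250)   # value stated in the question
NUM_POOLINGS = 5          # one after the first Conv2D pair + four convolution blocks


def pooled_size(size, num_poolings):
    """Apply 2x2 max pooling (stride 2, 'valid' padding) num_poolings times."""
    for _ in range(num_poolings):
        size = size // 2  # floor division: an odd size drops its last row/column
    return size


side = pooled_size(IMAGE_SIZE[0], NUM_POOLINGS)

for filters in (256, 512):               # the two choices for filter_layer5
    flat = side * side * filters         # length of the vector produced by Flatten
    dense_params = flat * 512 + 512      # weights + biases of the first Dense(512)
    print(f"filters={filters}: flatten={flat}, first dense params={dense_params:,}")
```

Under these assumptions most of the trainable parameters sit in the first dense layer, which may be worth keeping in mind when only a couple of epochs are available per trial.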

The tuner is instantiated using:

objective = kt.Objective("val_cat_accuracy", direction = "max")

tuner = kt.Hyperband(
    model_builder_for_tuning,
    objective = objective,
    max_epochs = 10,
    factor = 3,
    directory = "/content/drive/MyDrive/Colab Notebooks/hp_xray-new",
    project_name = "chest_xray_hp_tuning",
)

and the hyperparameter search is started with:

with tf.device('/device:GPU:0'):
    tuner.search(
        train_generator,
        epochs = 25,
        validation_data = val_generator,
        callbacks = [early_stop_callback_hpt]
    )
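It may help to see how many epochs each trial actually gets. As far as I understand, Hyperband assigns its own epoch budget per trial, so the epochs = 25 argument passed to search is not what each trial runs for. The sketch below is an assumption about the schedule, not the library's code: it takes the budget for a round to be ceil(max_epochs / factor ** (bracket - round)), with bracket b having b + 1 rounds and a minimum of 1 epoch. That assumption agrees with the trial log later in this post (2 epochs in bracket 2, round 0).

```python
import math

MAX_EPOCHS = 10   # from the tuner definition above
FACTOR = 3        # from the tuner definition above
MIN_EPOCHS = 1    # assumed lower bound on the epochs given to a trial


def num_brackets(max_epochs, factor, min_epochs=MIN_EPOCHS):
    """Count how often max_epochs can be divided by factor before dropping below min_epochs."""
    brackets = 0
    epochs = max_epochs
    while epochs >= min_epochs:
        epochs /= factor
        brackets += 1
    return brackets


def epochs_for(bracket, round_, max_epochs=MAX_EPOCHS, factor=FACTOR):
    """Assumed epoch budget for a trial in the given bracket and round."""
    return math.ceil(max_epochs / factor ** (bracket - round_))


# Map each bracket to the list of epoch budgets of its rounds.
schedule = {
    bracket: [epochs_for(bracket, r) for r in range(bracket + 1)]
    for bracket in reversed(range(num_brackets(MAX_EPOCHS, FACTOR)))
}
print(schedule)
```

If this reading is right, the first round of the largest bracket trains every candidate for only 2 epochs before it is judged on val_cat_accuracy.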

Image data for training/validation is loaded using generators:

train_data_generator = tf.keras.preprocessing.image.ImageDataGenerator(
    zoom_range = 0.3,
    vertical_flip = True,
)
test_val_data_generator = tf.keras.preprocessing.image.ImageDataGenerator()

and the flow_from_directory function:

train_generator = train_data_generator.flow_from_directory(
    directory = train_dir,
    target_size = (IMAGE_SIZE[0], IMAGE_SIZE[1]),
    batch_size = BATCH_SIZE,
    class_mode = "categorical",
    shuffle = True
)

val_generator = test_val_data_generator.flow_from_directory(
    directory = val_dir,
    target_size = (IMAGE_SIZE[0], IMAGE_SIZE[1]),
    batch_size = BATCH_SIZE,
    class_mode = "categorical",
    shuffle = False
)

test_generator = test_val_data_generator.flow_from_directory(
    directory = test_dir,
    target_size = (IMAGE_SIZE[0], IMAGE_SIZE[1]),
    batch_size = BATCH_SIZE,
    class_mode = "categorical",
    shuffle = False
)

with the correct output:

Found 3900 images belonging to 3 classes.
Found 279 images belonging to 3 classes.
Found 300 images belonging to 3 classes.
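For reference, these counts determine the number of batches per epoch. The short sketch below assumes that each generator yields ceil(n / BATCH_SIZE) batches per pass, with a smaller final batch; the dictionary keys are just labels chosen for this example.

```python
import math

BATCH_SIZE = 32  # value stated in the question
image_counts = {"train": 3900, "val": 279, "test": 300}  # from the output above

# Number of batches needed to cover every image once.
steps = {name: math.ceil(n / BATCH_SIZE) for name, n in image_counts.items()}

# Size of the final, partially filled batch in each split.
last_batch = {
    name: n - (steps[name] - 1) * BATCH_SIZE for name, n in image_counts.items()
}

for name in image_counts:
    print(f"{name}: {steps[name]} steps per epoch, last batch has {last_batch[name]} images")
```

The training value agrees with the 122/122 progress counter in the log further down.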

However, as soon as I start hyperparameter tuning, I get very disturbing outputs, such as:

Trial 5 Complete [00h 02m 55s]
val_cat_accuracy: 0.3333333432674408

Best val_cat_accuracy So Far: 0.3333333432674408
Total elapsed time: 00h 16m 15s

Search: Running Trial #6

Value             |Best Value So Far |Hyperparameter
5                 |7                 |kernel_size1
32                |64                |filter_layer2
64                |64                |filter_layer3
128               |128               |filter_layer4
512               |256               |filter_layer5
0.2               |0.4               |convolution_dropout
0.001             |0.0001            |learning_rate
2                 |2                 |tuner/epochs
0                 |0                 |tuner/initial_epoch
2                 |2                 |tuner/bracket
0                 |0                 |tuner/round

Epoch 1/2
122/122 [==============================] - 90s 720ms/step - loss: 1.1538 - cat_accuracy: 0.5269 - precision: 0.5461 - recall: 0.4723 - val_loss: 1.1022 - val_cat_accuracy: 0.3333 - val_precision: 0.0000e+00 - val_recall: 0.0000e+00

It seems that val_cat_accuracy is stuck around 0.33(3), and the val_precision and val_recall are either 0 or 0.33(3) as well.
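One way to read these numbers: a val_loss of about 1.10 is close to ln 3, the cross-entropy of a model that assigns roughly equal probability to all three classes. The sketch below illustrates this under two assumptions that the post does not confirm: that the validation set is balanced (93 images per class), and that Precision and Recall use their default 0.5 threshold. The example probabilities are made up for illustration.

```python
import math

NUM_CLASSES = 3


def categorical_crossentropy(y_true, y_pred):
    """Cross-entropy between a one-hot label and a predicted distribution."""
    return -sum(t * math.log(p) for t, p in zip(y_true, y_pred) if t > 0)


# A model that outputs the same probability for every class.
uniform = [1.0 / NUM_CLASSES] * NUM_CLASSES
loss_uniform = categorical_crossentropy([1, 0, 0], uniform)  # equals ln(3)

# Hypothetical near-uniform output: the argmax is always class 0,
# but no probability exceeds the 0.5 threshold.
near_uniform = [0.36, 0.33, 0.31]
positives = sum(1 for p in near_uniform if p > 0.5)

# Always predicting one class on a balanced set (assumed 93 images per class).
val_per_class = 93
constant_accuracy = val_per_class / (val_per_class * NUM_CLASSES)

print(f"loss for uniform predictions: {loss_uniform:.4f}")
print(f"predicted positives at threshold 0.5: {positives}")
print(f"accuracy when always predicting one class: {constant_accuracy:.4f}")
```

If this interpretation holds, an accuracy of 1/3 together with zero precision and recall would point to near-constant, low-confidence predictions on the validation data rather than to a problem in the metric itself.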

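Surprisingly, when I built an initial model (with some randomly chosen kernel sizes, numbers of filters, and dropout rates), it trained for 36 epochs and reached val_cat_accuracy: 0.7348.

I would appreciate any hints on possible reasons for this strange behaviour during hyperparameter tuning.

Thanks! :-)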