MobileViT binary classification: `logits` and `labels` must have the same shape, received ((None, 2) vs (None, 1))

I am using the Colab notebook (https://colab.research.google.com/github/keras-team/keras-io/blob/master/examples/vision/ipynb/mobilevit.ipynb) for MobileViT to train on a dataset I have of 25k pictures across 2 classes. Since it is binary classification, I used keras.losses.BinaryCrossentropy as the loss and sigmoid as the activation function in the last layer:

def create_mobilevit(num_classes=2):
    inputs = keras.Input((image_size, image_size, 3))
    x = layers.Rescaling(scale=1.0 / 255)(inputs)

    # Initial conv-stem -> MV2 block.
    x = conv_block(x, filters=16)
    x = inverted_residual_block(
        x, expanded_channels=16 * expansion_factor, output_channels=16
    )

    # Downsampling with MV2 block.
    x = inverted_residual_block(
        x, expanded_channels=16 * expansion_factor, output_channels=24, strides=2
    )
    x = inverted_residual_block(
        x, expanded_channels=24 * expansion_factor, output_channels=24
    )
    x = inverted_residual_block(
        x, expanded_channels=24 * expansion_factor, output_channels=24
    )

    # First MV2 -> MobileViT block.
    x = inverted_residual_block(
        x, expanded_channels=24 * expansion_factor, output_channels=48, strides=2
    )
    x = mobilevit_block(x, num_blocks=2, projection_dim=64)

    # Second MV2 -> MobileViT block.
    x = inverted_residual_block(
        x, expanded_channels=64 * expansion_factor, output_channels=64, strides=2
    )
    x = mobilevit_block(x, num_blocks=4, projection_dim=80)

    # Third MV2 -> MobileViT block.
    x = inverted_residual_block(
        x, expanded_channels=80 * expansion_factor, output_channels=80, strides=2
    )
    x = mobilevit_block(x, num_blocks=3, projection_dim=96)
    x = conv_block(x, filters=320, kernel_size=1, strides=1)

    # Classification head.
    x = layers.GlobalAvgPool2D()(x)
    outputs = layers.Dense(num_classes, activation="sigmoid")(x)

    return keras.Model(inputs, outputs)

And here's my dataset preparation cell:

batch_size = 64
auto = tf.data.AUTOTUNE
resize_bigger = 512
num_classes = 2


def preprocess_dataset(is_training=True):
    def _pp(image, label):
        if is_training:
            # Resize to a bigger spatial resolution and take the random
            # crops.
            image = tf.image.resize(image, (resize_bigger, resize_bigger))
            image = tf.image.random_crop(image, (image_size, image_size, 3))
            image = tf.image.random_flip_left_right(image)
        else:
            image = tf.image.resize(image, (image_size, image_size))
        label = tf.one_hot(label, depth=num_classes)
        return image, label

    return _pp


def prepare_dataset(dataset, is_training=True):
    if is_training:
        dataset = dataset.shuffle(batch_size * 10)
    dataset = dataset.map(preprocess_dataset(is_training), num_parallel_calls=auto)
    return dataset.batch(batch_size).prefetch(auto)

And this is the cell for training the model:

learning_rate = 0.002
label_smoothing_factor = 0.1
epochs = 30

optimizer = keras.optimizers.Adam(learning_rate=learning_rate)
loss_fn = keras.losses.BinaryCrossentropy(label_smoothing=label_smoothing_factor)


def run_experiment(epochs=epochs):
    mobilevit_xxs = create_mobilevit(num_classes=num_classes)
    mobilevit_xxs.compile(optimizer=optimizer, loss=loss_fn, metrics=["accuracy"])

    checkpoint_filepath = "/tmp/checkpoint"
    checkpoint_callback = keras.callbacks.ModelCheckpoint(
        checkpoint_filepath,
        monitor="val_accuracy",
        save_best_only=True,
        save_weights_only=True,
    )

    mobilevit_xxs.fit(
        train_ds,
        validation_data=val_ds,
        epochs=epochs,
        callbacks=[checkpoint_callback],
    )
    mobilevit_xxs.load_weights(checkpoint_filepath)
    _, accuracy = mobilevit_xxs.evaluate(val_ds)
    print(f"Validation accuracy: {round(accuracy * 100, 2)}%")
    return mobilevit_xxs


mobilevit_xxs = run_experiment()

Basically the code is identical to https://colab.research.google.com/github/keras-team/keras-io/blob/master/examples/vision/ipynb/mobilevit.ipynb except for the change to the BinaryCrossentropy loss and sigmoid as the activation function. I don't understand why I am getting this error even though I have explicitly one-hot-encoded my class labels:

ValueError: `logits` and `labels` must have the same shape, received ((None, 2) vs (None, 1)).
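
For reference, the two shapes the error is comparing can be checked like this (a quick, hypothetical check, assuming train_ds has been built with prepare_dataset as above; model is just a fresh instance from create_mobilevit):

# Hypothetical debugging snippet, not part of the original notebook.
model = create_mobilevit(num_classes=num_classes)

# What BinaryCrossentropy receives as `logits`:
print("model output shape:", model.output_shape)

# What it receives as `labels` (one batch from the prepared dataset):
for _, labels in train_ds.take(1):
    print("label batch shape:", labels.shape)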

Answer by 倾听心声的旋律 (2025-02-13 23:19:37):

You need to set num_classes = 1 instead of num_classes = 2, because you have used a sigmoid activation function, which returns a single value between 0 and 1 for binary classification.
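
A minimal sketch of what that looks like, assuming the rest of the notebook stays as posted: a single-unit sigmoid head, i.e. Dense(1, activation="sigmoid") in create_mobilevit, paired with plain 0/1 labels of shape (batch, 1) instead of the tf.one_hot call in preprocess_dataset. The stand-alone example below (the 96-dim feature input is made up purely for illustration) shows the shapes that then agree:

import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

# Illustrative only: a single sigmoid unit on top of some pooled features.
features = keras.Input((96,))  # stand-in for the GlobalAvgPool2D output
outputs = layers.Dense(1, activation="sigmoid")(features)
head = keras.Model(features, outputs)

loss_fn = keras.losses.BinaryCrossentropy(label_smoothing=0.1)

# Labels kept as 0/1 floats of shape (batch, 1), matching the (None, 1) output.
y_true = tf.constant([[0.0], [1.0], [1.0]])
y_pred = head(tf.random.normal((3, 96)))
print(y_pred.shape)             # (3, 1)
print(loss_fn(y_true, y_pred))  # scalar loss, no shape mismatch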

Values < 0.5 are treated as class 0 and values > 0.5 as class 1 for the two binary classes (0, 1).
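
As a rough sketch of that thresholding (assuming model is the retrained single-output version and preds are its sigmoid outputs):

preds = model.predict(val_ds)                 # shape (num_samples, 1), values in [0, 1]
pred_classes = (preds > 0.5).astype("int32")  # 0 or 1 per sample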

Please refer to the replicated gist for your reference.
