Transfer learning using only certain layers from a pre-trained network

Published on 2025-01-20 08:55:07


I am building a model for human face segmentation into skin and non-skin areas. As a starting point, I am using the model/method shown here and adding a dense layer at the end with sigmoid activation. The model works very well for my purpose, giving a good Dice metric score. The model uses 2 pre-trained layers from ResNet50 as a backbone for feature detection. I have read several articles, books, and code, but couldn't find any information on how to determine which layers to choose for feature extraction.
I compared the ResNet50 architecture with Xception, picked two similar layers, replaced the layers in the original network (here), and ran the training. I got similar results, not better, not worse.
I have the following questions:

  1. How do I determine which layers are responsible for low-level/high-level features?
  2. Is using only some pre-trained layers better than using the full pre-trained network, in terms of training time and the number of trainable parameters?
  3. Where can I find more information about using only layers from pre-trained networks?

Here is the code for a quick overview:

# convolution_block and DilatedSpatialPyramidPooling are helper
# functions from the referenced DeepLabV3+ example.
from tensorflow import keras
from tensorflow.keras import layers

def DeeplabV3Plus(image_size, num_classes):
    model_input = keras.Input(shape=(image_size, image_size, 3))
    resnet50 = keras.applications.ResNet50(
        weights="imagenet", include_top=False, input_tensor=model_input)
    x = resnet50.get_layer("conv4_block6_2_relu").output
    x = DilatedSpatialPyramidPooling(x)

    input_a = layers.UpSampling2D(size=(image_size // 4 // x.shape[1], image_size // 4 // x.shape[2]), interpolation="bilinear")(x)
    input_b = resnet50.get_layer("conv2_block3_2_relu").output
    input_b = convolution_block(input_b, num_filters=48, kernel_size=1)

    x = layers.Concatenate(axis=-1)([input_a, input_b])
    x = convolution_block(x)
    x = convolution_block(x)
    x = layers.UpSampling2D(size=(image_size // x.shape[1], image_size // x.shape[2]), interpolation="bilinear")(x)
    model_output = layers.Conv2D(num_classes, kernel_size=(1, 1), padding="same")(x)
    return keras.Model(inputs=model_input, outputs=model_output)
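The `UpSampling2D` size arithmetic above is determined by the output stride of each tapped stage. A small sketch makes this explicit (the stage strides listed here come from the standard ResNet50 architecture and are stated as assumptions, not measured from the model):

```python
# Output strides of the standard ResNet50 stages (ImageNet variant).
# DeepLabV3+ taps conv4 (stride 16) for high-level context and
# conv2 (stride 4) for low-level spatial detail; these strides explain
# the UpSampling2D size arithmetic in DeeplabV3Plus above.
RESNET50_STAGE_STRIDES = {
    "conv1_relu": 2,
    "pool1_pool": 4,
    "conv2_block3_out": 4,   # low-level features, fine spatial detail
    "conv3_block4_out": 8,
    "conv4_block6_out": 16,  # high-level features fed to the ASPP module
    "conv5_block3_out": 32,
}

def feature_map_size(image_size: int, layer_name: str) -> int:
    """Spatial size of a stage's output for a square input image."""
    return image_size // RESNET50_STAGE_STRIDES[layer_name]

image_size = 512
high = feature_map_size(image_size, "conv4_block6_out")  # 32
low = feature_map_size(image_size, "conv2_block3_out")   # 128
# UpSampling2D factor that brings the ASPP output (stride 16) up to
# stride 4 so it can be concatenated with the low-level branch:
upsample_factor = (image_size // 4) // high              # 4
print(high, low, upsample_factor)
```

With a 512-pixel input, the high-level tap yields a 32x32 map and the low-level tap a 128x128 map, so the 4x upsampling before the concatenation is exactly the stride ratio between the two stages.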

And here is my modified code using Xception layers as the backbone:

def DeeplabV3Plus(image_size, num_classes):
    model_input = keras.Input(shape=(image_size, image_size, 3))

    Xception_model = keras.applications.Xception(
        weights="imagenet", include_top=False, input_tensor=model_input)
    xception_x1 = Xception_model.get_layer("block9_sepconv3_act").output
    x = DilatedSpatialPyramidPooling(xception_x1)

    input_a = layers.UpSampling2D(size=(image_size // 4 // x.shape[1], image_size // 4 // x.shape[2]), interpolation="bilinear")(x)
    input_a = layers.AveragePooling2D(pool_size=(2, 2))(input_a)
    xception_x2 = Xception_model.get_layer("block4_sepconv1_act").output
    input_b = convolution_block(xception_x2, num_filters=256, kernel_size=1)

    x = layers.Concatenate(axis=-1)([input_a, input_b])
    x = convolution_block(x)
    x = convolution_block(x)
    x = layers.UpSampling2D(size=(image_size // x.shape[1], image_size // x.shape[2]), interpolation="bilinear")(x)
    x = layers.Conv2D(num_classes, kernel_size=(1, 1), padding="same")(x)
    model_output = layers.Dense(x.shape[2], activation='sigmoid')(x)
    return keras.Model(inputs=model_input, outputs=model_output)

Thanks in advance!


Comments (1)

套路撩心 2025-01-27 08:55:07
  1. In general, the first layers (the ones closer to the input) are responsible for learning low-level features such as edges and textures, whereas the last layers learn high-level, more dataset/task-specific features. This is why, when doing transfer learning, you usually delete only the last few layers and replace them with others that can deal with your specific problem.
  2. It depends. Transferring the whole network, without deleting or adding any layers, basically means that the network won't learn anything new (unless you leave the layers unfrozen, in which case you are fine-tuning). On the other hand, if you delete some layers and add a few more, then the number of trainable parameters depends only on the new layers you just added.

What I suggest you do is:

  1. Delete a few layers from the pre-trained network, freeze the remaining layers, and add a few new layers (even just one)
  2. Train the new network with a certain learning rate (usually this learning rate is not very low)
  3. Fine-tune: unfreeze all the layers, lower the learning rate, and re-train the whole network