Transfer learning using only certain layers from a pre-trained network

Published on 2025-01-20 08:55:07


I am building a model for human face segmentation into skin and non-skin areas. As a starting point, I am using the model/method shown here and adding a dense layer at the end with sigmoid activation. The model works very well for my purpose, giving a good Dice metric score. The model uses 2 pre-trained layers from ResNet50 as a backbone for feature detection. I have read several articles, books, and code, but couldn't find any information on how to determine which layers to choose for feature extraction.
I compared the ResNet50 architecture with Xception, picked two similar layers, replaced the layers in the original network (here), and ran the training. I got similar results, not better, not worse.
I have the following questions:

  1. How do I determine which layers are responsible for low-level/high-level features?
  2. Is using only some pre-trained layers better than using the full pre-trained network, in terms of training time and the number of trainable parameters?
  3. Where can I find more information about using only layers from pre-trained networks?

Here is the code for a quick overview:

# convolution_block and DilatedSpatialPyramidPooling are helper
# functions from the referenced DeepLabV3+ example.
from tensorflow import keras
from tensorflow.keras import layers

def DeeplabV3Plus(image_size, num_classes):
    model_input = keras.Input(shape=(image_size, image_size, 3))
    resnet50 = keras.applications.ResNet50(
        weights="imagenet", include_top=False, input_tensor=model_input)
    x = resnet50.get_layer("conv4_block6_2_relu").output
    x = DilatedSpatialPyramidPooling(x)

    input_a = layers.UpSampling2D(size=(image_size // 4 // x.shape[1], image_size // 4 // x.shape[2]), interpolation="bilinear")(x)
    input_b = resnet50.get_layer("conv2_block3_2_relu").output
    input_b = convolution_block(input_b, num_filters=48, kernel_size=1)

    x = layers.Concatenate(axis=-1)([input_a, input_b])
    x = convolution_block(x)
    x = convolution_block(x)
    x = layers.UpSampling2D(size=(image_size // x.shape[1], image_size // x.shape[2]), interpolation="bilinear")(x)
    model_output = layers.Conv2D(num_classes, kernel_size=(1, 1), padding="same")(x)
    return keras.Model(inputs=model_input, outputs=model_output)
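The `UpSampling2D` size arithmetic above is determined by the output stride of each tapped stage. A small sketch makes this explicit (the stage strides listed here come from the standard ResNet50 architecture and are stated as assumptions, not measured from the model):

```python
# Output strides of the standard ResNet50 stages (ImageNet variant).
# DeepLabV3+ taps conv4 (stride 16) for high-level context and
# conv2 (stride 4) for low-level spatial detail; these strides explain
# the UpSampling2D size arithmetic in DeeplabV3Plus above.
RESNET50_STAGE_STRIDES = {
    "conv1_relu": 2,
    "pool1_pool": 4,
    "conv2_block3_out": 4,   # low-level features, fine spatial detail
    "conv3_block4_out": 8,
    "conv4_block6_out": 16,  # high-level features fed to the ASPP module
    "conv5_block3_out": 32,
}

def feature_map_size(image_size: int, layer_name: str) -> int:
    """Spatial size of a stage's output for a square input image."""
    return image_size // RESNET50_STAGE_STRIDES[layer_name]

image_size = 512
high = feature_map_size(image_size, "conv4_block6_out")  # 32
low = feature_map_size(image_size, "conv2_block3_out")   # 128
# UpSampling2D factor that brings the ASPP output (stride 16) up to
# stride 4 so it can be concatenated with the low-level branch:
upsample_factor = (image_size // 4) // high              # 4
print(high, low, upsample_factor)
```

With a 512-pixel input, the high-level tap yields a 32x32 map and the low-level tap a 128x128 map, so the 4x upsampling before the concatenation is exactly the stride ratio between the two stages.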

And here is my modified code using Xception layers as the backbone:

def DeeplabV3Plus(image_size, num_classes):
    model_input = keras.Input(shape=(image_size, image_size, 3))

    Xception_model = keras.applications.Xception(
        weights="imagenet", include_top=False, input_tensor=model_input)
    xception_x1 = Xception_model.get_layer("block9_sepconv3_act").output
    x = DilatedSpatialPyramidPooling(xception_x1)

    input_a = layers.UpSampling2D(size=(image_size // 4 // x.shape[1], image_size // 4 // x.shape[2]), interpolation="bilinear")(x)
    input_a = layers.AveragePooling2D(pool_size=(2, 2))(input_a)
    xception_x2 = Xception_model.get_layer("block4_sepconv1_act").output
    input_b = convolution_block(xception_x2, num_filters=256, kernel_size=1)

    x = layers.Concatenate(axis=-1)([input_a, input_b])
    x = convolution_block(x)
    x = convolution_block(x)
    x = layers.UpSampling2D(size=(image_size // x.shape[1], image_size // x.shape[2]), interpolation="bilinear")(x)
    x = layers.Conv2D(num_classes, kernel_size=(1, 1), padding="same")(x)
    model_output = layers.Dense(x.shape[2], activation='sigmoid')(x)
    return keras.Model(inputs=model_input, outputs=model_output)

Thanks in advance!


Comments (1)

套路撩心 2025-01-27 08:55:07
  1. In general, the first layers (the ones closer to the input) are responsible for learning low-level features such as edges and textures, whereas the last layers learn high-level, more dataset/task-specific features. This is why, when doing transfer learning, you usually delete only the last few layers and replace them with others that can deal with your specific problem.
  2. It depends. Transferring the whole network, without deleting or adding any layers, basically means that the network won't learn anything new (unless you leave the layers unfrozen, in which case you are fine-tuning). On the other hand, if you delete some layers and add a few more, then the number of trainable parameters depends only on the new layers you just added.

What I suggest you do is:

  1. Delete a few layers from the pre-trained network, freeze the remaining layers, and add a few new layers (even just one)
  2. Train the new network with a certain learning rate (usually this learning rate is not very low)
  3. Fine-tune: unfreeze all the layers, lower the learning rate, and re-train the whole network