MXNet - How to add a dropout layer to a pretrained ResNet_v1 model

Posted on 2025-01-30 11:49:11

I am trying to fine-tune a pretrained model in MXNet: ResNet50_v1.
This model does not have dropout, and I would like to add it to avoid overfitting and to make its last layers look similar to those of I3D_Resnet50_v1_Kinetics400.
I tried the following, but I get an error when training:

Last layers of the original network (ResNet50_v1):

...
(8): GlobalAvgPool2D(size=(1, 1), stride=(1, 1), padding=(0, 0), ceil_mode=True, global_pool=True, pool_type=avg, layout=NCHW)
  )
  (output): Dense(2048 -> 1000, linear)

My attempt:

    classes = 2
    model_name = 'ResNet50_v1'
    finetune_net = get_model(model_name, pretrained=True)

    with finetune_net.name_scope():
       finetune_net.output = nn.Dense(2048, in_units=2048)
       finetune_net.head = nn.HybridSequential()
       finetune_net.head.add(nn.Dropout(0.95))
       finetune_net.head.add(nn.Dense(2, in_units=2048))
       finetune_net.fc = nn.Dense(2, in_units=2048)

    finetune_net.output.initialize(init.Xavier(), ctx = ctx)
    finetune_net.head.initialize(init.Xavier(), ctx = ctx)
    finetune_net.fc.initialize(init.Xavier(), ctx = ctx)
    finetune_net.collect_params().reset_ctx(ctx)
    finetune_net.hybridize()

Last layers of the modified network (ResNet50_v1):

...
(8): GlobalAvgPool2D(size=(1, 1), stride=(1, 1), padding=(0, 0), ceil_mode=True, global_pool=True, pool_type=avg, layout=NCHW)
  )
(output): Dense(2048 -> 2048, linear)
  (head): HybridSequential(
    (0): Dropout(p = 0.95, axes=())
    (1): Dense(2048 -> 2, linear)
  )
  (fc): Dense(2048 -> 2, linear)
)

Last layers of I3D_Resnet50_v1_Kinetics400:

...
(st_avg): GlobalAvgPool3D(size=(1, 1, 1), stride=(1, 1, 1), padding=(0, 0, 0), ceil_mode=True, global_pool=True, pool_type=avg, layout=NCDHW)
    (head): HybridSequential(
      (0): Dropout(p = 0.8, axes=())
      (1): Dense(2048 -> 2, linear)
    )
    (fc): Dense(2048 -> 2, linear)

This is what the parameters of the modified network look like:

Parameter resnetv10_dense1_weight (shape=(2048, 2048), dtype=float32) write
Parameter resnetv10_dense1_bias (shape=(2048,), dtype=float32) write
Parameter resnetv10_dense2_weight (shape=(2, 2048), dtype=float32) write
Parameter resnetv10_dense2_bias (shape=(2,), dtype=float32) write
Parameter resnetv10_dense3_weight (shape=(2, 2048), dtype=float32) write
Parameter resnetv10_dense3_bias (shape=(2,), dtype=float32) write

Error when training:

/usr/local/lib/python3.7/dist-packages/mxnet/gluon/block.py:825: UserWarning: Parameter resnetv10_dense3_bias, resnetv10_dense3_weight, resnetv10_dense2_bias, resnetv10_dense2_weight is not used by any computation. Is this intended?
out = self.forward(*args)

UserWarning: Gradient of Parameter resnetv10_dense2_bias on context gpu(0) has not been updated by backward since last step. This could mean a bug in your model that made it only use a subset of the Parameters (Blocks) for this iteration. If you are intentionally only using a subset, call step with ignore_stale_grad=True to suppress this warning and skip updating of Parameters with stale gradient
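For reference, the flag the warning mentions would be passed to the trainer's step call; a sketch, assuming the gluon.Trainer in my training loop is named trainer. It only silences the warning, it does not make the new layers train:

    # Not a fix: this merely skips updating parameters whose gradients are
    # stale because they took no part in the forward/backward pass.
    trainer.step(batch_size, ignore_stale_grad=True)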

dense2 and dense3, the ones I added as new dense layers, are not being updated.
dense1 was already in the model; I only changed its output size from 1000 to 2048.
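One way to see why, as a quick check with Python's inspect module on the model class (finetune_net is the model from my attempt above):

    import inspect

    # Gluon's ResNetV1 forward only ever calls self.features and
    # self.output, so the attached head/fc blocks are registered as
    # parameters but never enter the computation graph.
    print(inspect.getsource(type(finetune_net).hybrid_forward))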

Any help would be very much appreciated as I am quite stuck ...

Comments (1)

冷…雨湿花 2025-02-06 11:49:11

Since you assign new layers to the model, you should reimplement the hybrid_forward (or forward) method to include them in the computation:

import mxnet as mx
from mxnet.gluon import nn
from mxnet.init import Xavier
from mxnet.gluon.block import HybridBlock
from gluoncv.model_zoo import get_model

class MyResNet(HybridBlock):
    def __init__(self, **kwargs):
        super().__init__(**kwargs)
        self.finetune_net = get_model('ResNet50_v1', pretrained=True)
        # Replace the pretrained 2048 -> 1000 classifier with a 2048 -> 2048 projection.
        self.finetune_net.output = nn.Dense(2048, in_units=2048)
        # Mirror I3D_Resnet50_v1_Kinetics400, where fc is the final classifier
        # and head wraps that same layer behind a dropout (note how the same
        # Dense(2048 -> 2) appears under both names in its printout).
        self.fc = nn.Dense(2, in_units=2048)
        self.head = nn.HybridSequential()
        self.head.add(nn.Dropout(0.95))
        self.head.add(self.fc)

    def hybrid_forward(self, F, x):
        x = self.finetune_net(x)
        x = self.head(x)  # dropout, then the 2048 -> 2 fc layer
        return x

    def initialize_outputs(self):
        # Only the newly created layers need initialization; the backbone
        # keeps its pretrained weights. head already contains fc.
        self.finetune_net.output.initialize(init=Xavier())
        self.head.initialize(init=Xavier())

my_resnet = MyResNet()
my_resnet.initialize_outputs()
x = mx.nd.random.uniform(shape=(1, 3, 224, 224))  # dummy batch as a smoke test
my_resnet(x)
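From here, a possible training setup; this is a sketch, and the context, optimizer settings, and dummy batch are assumptions rather than part of the original answer:

import mxnet as mx
from mxnet import gluon, autograd

ctx = mx.gpu(0) if mx.context.num_gpus() > 0 else mx.cpu()
my_resnet.collect_params().reset_ctx(ctx)
my_resnet.hybridize()

trainer = gluon.Trainer(my_resnet.collect_params(), 'sgd',
                        {'learning_rate': 0.001, 'momentum': 0.9})
loss_fn = gluon.loss.SoftmaxCrossEntropyLoss()

# One illustrative update on a dummy batch; real training would loop over
# a DataLoader. With head/fc now on the forward path, their gradients are
# no longer stale and the warnings go away.
data = mx.nd.random.uniform(shape=(8, 3, 224, 224), ctx=ctx)
label = mx.nd.array([0, 1, 0, 1, 0, 1, 0, 1], ctx=ctx)
with autograd.record():
    loss = loss_fn(my_resnet(data), label)
loss.backward()
trainer.step(data.shape[0])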