How to set the base model in inference mode?
The Keras documentation about fine-tuning states that it is important to "keep the BatchNormalization layers in inference mode by passing training=False when calling the base model." (Interestingly, every non-official example I have found on the topic ignores this setting.)
The documentation follows up with an example:
from tensorflow import keras

base_model = keras.applications.Xception(
    weights='imagenet',  # Load weights pre-trained on ImageNet.
    input_shape=(150, 150, 3),
    include_top=False)  # Do not include the ImageNet classifier at the top.
base_model.trainable = False

inputs = keras.Input(shape=(150, 150, 3))
scale_layer = keras.layers.Rescaling(scale=1 / 127.5, offset=-1)
x = scale_layer(inputs)

# We make sure that the base_model is running in inference mode here,
# by passing `training=False`. This is important for fine-tuning, as you will
# learn in a few paragraphs.
x = base_model(x, training=False)
x = keras.layers.GlobalAveragePooling2D()(x)
outputs = keras.layers.Dense(1)(x)
model = keras.Model(inputs, outputs)
The thing is that the example prepends preprocessing to the base model, while my model (EfficientNetB3) already has preprocessing included, and I don't know how to set my base_model with training=False without prepending an additional layer:
from keras.applications.efficientnet import EfficientNetB3
from keras.models import Sequential
from keras.layers import GlobalAveragePooling2D, Dropout, Dense

base_model = EfficientNetB3(weights='imagenet', include_top=False, input_shape=input_shape)
base_model.trainable = False

model = Sequential()
model.add(base_model)  # How to set base_model training=False?
model.add(GlobalAveragePooling2D())
model.add(Dropout(0.2))
model.add(Dense(10, activation="softmax", name="classifier"))
How to prove that training=False or training=True has an effect:
@Frightera explained to me how to "lock" the model's state, and I wanted to prove to myself that the lock happens by checking the BatchNormalization non-trainable variables. My understanding is that if I call the model with training=True, then it should update those variables. However, this is not the case, or am I missing something?
import tensorflow as tf
from tensorflow import keras
from keras.applications.efficientnet import EfficientNetB3
import numpy as np

class WrappedEffNet(keras.layers.Layer):

    def __init__(self, **kwargs):
        super(WrappedEffNet, self).__init__(**kwargs)
        self.model = EfficientNetB3(weights='imagenet',
                                    include_top=False,
                                    input_shape=(224, 224, 3))
        self.model.trainable = False

    def call(self, x, training=False):
        return self.model(x, training=training)  # Modified to pass also True.

base_model_wrapped = WrappedEffNet()

random_vector = tf.random.uniform((1, 224, 224, 3))

o1 = base_model_wrapped(random_vector)
o2 = base_model_wrapped(random_vector, training=False)

# Collecting all non-trainable variable values from all BatchNormalization layers.
array_a = np.array([])
for layer in base_model_wrapped.model.layers:
    if hasattr(layer, 'moving_mean'):
        array_a = np.concatenate([array_a, layer.moving_mean.numpy()])
        array_a = np.concatenate([array_a, layer.moving_variance.numpy()])

o3 = base_model_wrapped(random_vector, training=True)  # Changing to True; shouldn't this update the BatchNormalization non-trainable variables?

array_b = np.array([])
for layer in base_model_wrapped.model.layers:
    if hasattr(layer, 'moving_mean'):
        array_b = np.concatenate([array_b, layer.moving_mean.numpy()])
        array_b = np.concatenate([array_b, layer.moving_variance.numpy()])

print(np.allclose(array_a, array_b))  # Shouldn't this be False?
1 Answer:
It is not possible to invoke the call method of the base model inside a Sequential model the way the Functional API example does. However, you can think of the model as if it were a custom layer:
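A minimal sketch of such a wrapper, reusing the WrappedEffNet name and the head layers from the question, but hard-coding training=False inside call so the base model is locked in inference mode:

import tensorflow as tf
from tensorflow import keras
from keras.applications.efficientnet import EfficientNetB3

class WrappedEffNet(keras.layers.Layer):

    def __init__(self, **kwargs):
        super().__init__(**kwargs)
        self.model = EfficientNetB3(weights='imagenet',
                                    include_top=False,
                                    input_shape=(224, 224, 3))
        self.model.trainable = False

    def call(self, x, training=False):
        # Ignore whatever `training` value the outer model passes in:
        # the base model always runs in inference mode.
        return self.model(x, training=False)

model = keras.Sequential([
    WrappedEffNet(),
    keras.layers.GlobalAveragePooling2D(),
    keras.layers.Dropout(0.2),
    keras.layers.Dense(10, activation="softmax", name="classifier"),
])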
Sanity check: it is inference mode regardless of the value of training, as the check below shows. The model summary is the same as for the Sequential version.
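A minimal sketch of such a check, reusing the model built above; since the wrapper hard-codes training=False, the outputs must match:

import numpy as np

wrapper = model.layers[0]
x = tf.random.uniform((1, 224, 224, 3))

o1 = wrapper(x, training=False)
o2 = wrapper(x, training=True)  # Dropout/BatchNormalization inside still run in inference mode.
print(np.allclose(o1.numpy(), o2.numpy()))  # True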
Edit: In order to see the difference in the BatchNormalization moving statistics, compare three settings:

- model.trainable = False: the BatchNormalization layers run in inference mode, so moving_mean and moving_variance do not update even when the model is called with training=True.
- model.trainable = True, training=True: the moving statistics do update.
- model.trainable = True, training=False: inference mode again, so no update.
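A minimal standalone sketch of the three cases, using a single BatchNormalization layer instead of the full EfficientNetB3; the toy data and helper names here are assumptions for illustration:

import numpy as np
import tensorflow as tf
from tensorflow import keras

def moving_stats(bn):
    # Snapshot moving_mean and moving_variance for later comparison.
    return [bn.moving_mean.numpy().copy(), bn.moving_variance.numpy().copy()]

bn = keras.layers.BatchNormalization()
data = tf.random.uniform((8, 4)) * 10.0  # batch stats far from the (0, 1) initialization
_ = bn(data)  # build the layer; training defaults to False, so no update yet

# Case 1: trainable = False -> inference mode even when training=True is passed.
bn.trainable = False
before = moving_stats(bn)
_ = bn(data, training=True)
print(np.allclose(before, moving_stats(bn)))  # True: statistics unchanged

# Case 2: trainable = True and training=True -> moving statistics update.
bn.trainable = True
before = moving_stats(bn)
_ = bn(data, training=True)
print(np.allclose(before, moving_stats(bn)))  # False: statistics moved

# Case 3: trainable = True but training=False -> inference mode, no update.
before = moving_stats(bn)
_ = bn(data, training=False)
print(np.allclose(before, moving_stats(bn)))  # True: statistics unchanged

This also explains the question's experiment: because self.model.trainable = False was set, the BatchNormalization layers stayed in inference mode even when called with training=True, so np.allclose(array_a, array_b) printing True is the expected behavior.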