A question about TensorFlow model reproducibility
I am currently developing a TensorFlow model and have run into a problem with its reproducibility.
I built a simple dense model whose weights are initialized with a constant value, and trained it on dummy data.
import tensorflow as tf

weight_init = tf.keras.initializers.Constant(value=0.001)

inputs = tf.keras.Input(shape=(5,))
layer1 = tf.keras.layers.Dense(5,
                               activation=tf.nn.leaky_relu,
                               kernel_initializer=weight_init,
                               bias_initializer=weight_init)
outputs = layer1(inputs)

model = tf.keras.Model(inputs=inputs, outputs=outputs, name="test")
model.compile(loss='mse', optimizer='Adam')

model.fit([[111, 1.02, -1.98231, 1, 1],
           [112, 1.02, -1.98231, 1, 1],
           [113, 1.02, -1.98231, 1, 1],
           [114, 1.02, -1.98231, 1, 1],
           [115, 1.02, -1.98231, 1, 1],
           [116, 1.02, -1.98231, 1, 1],
           [117, 1.02, -1.98231, 1, 1]],
          [1, 1, 1, 1, 1, 1, 2], epochs=3, batch_size=1)
Even though I set the model's initial weights to 0.001, the training loss changes on every run...
What am I missing? Is there some other value I need to fix?
Even more surprisingly, if I change batch_size to 16, the loss no longer changes between runs.
...please teach me, guys...
1 Answer
Since keras.Model.fit() has the default kwarg shuffle=True, the data is shuffled across batches. If you change batch_size to an integer larger than the data length, the shuffle has no effect, because only one batch remains. So adding shuffle=False to model.fit() achieves reproducibility here.

Additionally, as your model grows bigger, the real reproducibility problem arises: the results of two successive runs will differ slightly even though you use no randomness or shuffling, but just click run, then click run again. We describe this as determinism rather than reproducibility. Determinism is a good question that is easily overlooked by many users. Let's start with the conclusion: reproducibility is influenced by operation_seed + hidden_global_seed.

How to do it? TensorFlow's determinism documentation states this precisely: add the following code before building or restoring the model.
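The code the answer refers to was omitted here; a minimal sketch of the setup TensorFlow recommends (assuming TF 2.9 or later, where both calls exist; the seed value 42 is arbitrary):

```python
import tensorflow as tf

# Set the global seed and the per-operation seeds in one call,
# covering Python's random, NumPy, and TensorFlow.
tf.keras.utils.set_random_seed(42)

# Force TensorFlow to select deterministic kernel implementations.
# Note: this can slow training down significantly.
tf.config.experimental.enable_op_determinism()
```

Both calls must run before any model is built or restored, since ops created earlier may already have non-deterministic kernels bound.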
But use it only if you rely heavily on reproducibility, since tf.config.experimental.enable_op_determinism() will reduce speed significantly. The deeper reason is that hardware sacrifices some precision in order to speed up computation, which usually does not affect our algorithms. In deep learning, however, models are very large, so rounding errors occur easily, and training cycles are very long, so those errors accumulate. In a regression model any extra error may be unacceptable, so we need deterministic algorithms in that situation.
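Putting the first fix together with the question's own model, a minimal sketch (same toy data as above; the key change is shuffle=False, plus seeding everything for good measure):

```python
import tensorflow as tf

tf.keras.utils.set_random_seed(0)  # seed Python, NumPy and TF RNGs

weight_init = tf.keras.initializers.Constant(value=0.001)
inputs = tf.keras.Input(shape=(5,))
outputs = tf.keras.layers.Dense(
    5,
    activation=tf.nn.leaky_relu,
    kernel_initializer=weight_init,
    bias_initializer=weight_init,
)(inputs)
model = tf.keras.Model(inputs=inputs, outputs=outputs, name="test")
model.compile(loss="mse", optimizer="Adam")

x = [[111 + i, 1.02, -1.98231, 1, 1] for i in range(7)]
y = [1, 1, 1, 1, 1, 1, 2]

# shuffle=False removes the cross-batch shuffling that made each run differ.
history = model.fit(x, y, epochs=3, batch_size=1, shuffle=False, verbose=0)
```

With shuffling disabled, batches arrive in the same order every run, so the per-epoch losses repeat across runs on the same hardware.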