Model training with tf.data.Dataset and NumPy arrays produces different results
I use the Keras model training API and observed differences when training the model with NumPy arrays (x_train and y_train) and with tf.data.Dataset.from_tensor_slices((x_train, y_train)). A minimal working example is shown below:
import numpy as np
import tensorflow as tf

tf.keras.utils.set_random_seed(0)

n_examples, n_dims = (100, 10)
raw_dataset = np.random.randn(n_examples, n_dims)

model = tf.keras.models.Sequential(
    [
        tf.keras.layers.Dense(
            1024, activation="relu", use_bias=True
        ),
        tf.keras.layers.Dense(
            1, activation="linear", use_bias=True
        ),
    ]
)
model.compile(
    optimizer=tf.keras.optimizers.Adam(),
    loss="mse",
)

x_train = raw_dataset[:, :-1]
y_train = raw_dataset[:, -1]
dataset = tf.data.Dataset.from_tensor_slices((x_train, y_train))

n_epochs = 10
batch_size = 16

use_dataset = True

if use_dataset:
    model.fit(
        dataset.batch(batch_size=batch_size),
        epochs=n_epochs,
    )
else:
    model.fit(
        x=x_train,
        y=y_train,
        batch_size=batch_size,
        epochs=n_epochs,
    )

print("Evaluation:")
model.evaluate(x_train, y_train)
model.evaluate(dataset.batch(batch_size=batch_size))
If I run this code with use_dataset = True, the final performance is:
Evaluation:
4/4 [==============================] - 0s 825us/step - loss: 0.4132
7/7 [==============================] - 0s 701us/step - loss: 0.4132
If I run it with use_dataset = False, I get:
Evaluation:
4/4 [==============================] - 0s 855us/step - loss: 0.4219
7/7 [==============================] - 0s 808us/step - loss: 0.4219
I expected that the two training loops would perform identically. Interestingly, the model performance is identical if I set batch_size = n_examples. The difference seems to be related to the way that batches are handled internally. Why is this happening? Is it a bug or a feature?
Comments (1)
The behavior is due to the default parameter shuffle=True in model.fit() and is not a bug. According to the docs, the shuffle argument is ignored when a tf.data.Dataset is passed, so the data is not reshuffled after each epoch as it is in the other approach with NumPy arrays. Here is the code to get the same results for both methods:
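The answer's original snippet is not reproduced in this excerpt. Below is a minimal sketch of one way to make the two runs match, assuming the fix amounts to passing shuffle=False to model.fit on the NumPy path so that neither run reshuffles between epochs; variable names reuse the question's example.

import numpy as np
import tensorflow as tf

tf.keras.utils.set_random_seed(0)

n_examples, n_dims = (100, 10)
raw_dataset = np.random.randn(n_examples, n_dims)

model = tf.keras.models.Sequential(
    [
        tf.keras.layers.Dense(1024, activation="relu", use_bias=True),
        tf.keras.layers.Dense(1, activation="linear", use_bias=True),
    ]
)
model.compile(optimizer=tf.keras.optimizers.Adam(), loss="mse")

x_train = raw_dataset[:, :-1]
y_train = raw_dataset[:, -1]
dataset = tf.data.Dataset.from_tensor_slices((x_train, y_train))

n_epochs = 10
batch_size = 16
use_dataset = False  # flip this flag: both settings should now reach the same loss

if use_dataset:
    # Dataset path: fit() ignores its shuffle argument, so batches keep their order
    model.fit(
        dataset.batch(batch_size=batch_size),
        epochs=n_epochs,
    )
else:
    # NumPy path: disable the default per-epoch shuffling so the batch order matches
    model.fit(
        x=x_train,
        y=y_train,
        batch_size=batch_size,
        shuffle=False,
        epochs=n_epochs,
    )

print("Evaluation:")
model.evaluate(x_train, y_train)
model.evaluate(dataset.batch(batch_size=batch_size))

Conversely, if per-epoch reshuffling is what you want, you could call dataset.shuffle(n_examples, reshuffle_each_iteration=True) before .batch(...) on the Dataset path; both runs would then reshuffle every epoch, although not in bit-identical order, so the final losses would still differ slightly.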