Data size problem when fitting a multi-input multi-output autoencoder in Keras
I try to build up a multi-input multi-output encoder for dense data representation. (The explanation for the architecture is that I wanted one common network to optimize for the numerical and the categorical data reconstruction in one.)
From my original data I produced a 28-column wide one-hot-encoded array for the categorical data and another 17-column wide normalized array for the numerical data:
m_cat_train.shape # (59768, 28)
m_cat_test.shape # (3146, 28)
m_num_train.shape # (59768, 17)
m_num_test.shape # (3146, 17)
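For context, arrays of this shape can be produced with a sketch like the following. The raw data here is synthetic and the preprocessing (one-hot via an identity matrix, min-max normalization) is an assumption about how such arrays are typically built, not the poster's actual pipeline:

```python
import numpy as np

# Hypothetical raw data: one categorical column with 28 levels,
# plus 17 numeric columns (sizes chosen to match the question).
rng = np.random.default_rng(0)
n_samples = 100
cat_labels = rng.integers(0, 28, size=n_samples)  # integer category codes
num_raw = rng.normal(size=(n_samples, 17))        # raw numeric features

# One-hot encode the categorical column -> (n_samples, 28)
m_cat = np.eye(28)[cat_labels]

# Min-max normalize each numeric column to [0, 1] -> (n_samples, 17)
mins = num_raw.min(axis=0)
maxs = num_raw.max(axis=0)
m_num = (num_raw - mins) / (maxs - mins)

print(m_cat.shape, m_num.shape)  # (100, 28) (100, 17)
```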
My architecture:
from keras.layers import Input, Dense, concatenate
from keras.models import Model

cat_input = Input(shape=(m_cat.shape[1],))
num_input = Input(shape=(m_num.shape[1],))
cat_enc = Dense(16, activation='relu')(cat_input)
cat_enc = Dense(8, activation='relu')(cat_enc)
num_enc = Dense(16, activation='relu')(num_input)
num_enc = Dense(8, activation='relu')(num_enc)
bottleneck = concatenate([cat_enc, num_enc])
cat_dec = Dense(8, activation='relu')(bottleneck)
cat_dec = Dense(16, activation='relu')(cat_dec)
cat_output = Dense(m_cat.shape[1], activation='sigmoid', name="cat_output")(cat_dec)
num_dec = Dense(8, activation='relu')(bottleneck)
num_dec = Dense(16, activation='relu')(num_dec)
num_output = Dense(m_num.shape[1], activation='linear', name="num_output")(num_dec)
model = Model(inputs=[cat_input, num_input],
              outputs=[cat_output, num_output],
              name="autoencoder")
However, at training:
model.compile(optimizer="adam",
              loss={"cat_output": "categorical_crossentropy", "num_output": "mse"},
              loss_weights={"cat_output": 1.0, "num_output": 1.0},
              metrics={"cat_output": 'accuracy', "num_output": 'accuracy'})
hist = model.fit([m_cat_train, m_num_train], [m_cat_test, m_num_test],
                 batch_size=16,
                 epochs=16,
                 verbose=1)
I get the following error message:
ValueError: Data cardinality is ambiguous:
x sizes: 59756, 59756
y sizes: 3146, 3146
Make sure all arrays contain the same number of samples.
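"Cardinality" here simply means the number of samples, i.e. `shape[0]` of each array: Keras requires every array in `x` and every array in `y` to agree on it, and in this call `x` holds the training arrays while `y` holds the much smaller test arrays. The check can be sketched as follows (a simplified stand-in for Keras's internal validation, not its actual implementation; the arrays are dummy zeros mimicking the shapes from the question):

```python
import numpy as np

def check_cardinality(x_arrays, y_arrays):
    """Simplified stand-in for Keras's data-cardinality check:
    every input and target array must have the same number of rows."""
    x_sizes = {a.shape[0] for a in x_arrays}
    y_sizes = {a.shape[0] for a in y_arrays}
    if len(x_sizes | y_sizes) != 1:
        raise ValueError(
            f"Data cardinality is ambiguous: x sizes: {sorted(x_sizes)}, "
            f"y sizes: {sorted(y_sizes)}"
        )

# Dummy arrays with the shapes from the question:
m_cat_train = np.zeros((59768, 28))
m_num_train = np.zeros((59768, 17))
m_cat_test = np.zeros((3146, 28))
m_num_test = np.zeros((3146, 17))

# Training arrays as x but test arrays as y -> row counts disagree
try:
    check_cardinality([m_cat_train, m_num_train], [m_cat_test, m_num_test])
except ValueError as err:
    print(err)
```

For an autoencoder the targets are the inputs themselves, so the likely fix is to pass the training arrays as `y` as well and supply the test pair through `validation_data`, e.g. `model.fit([m_cat_train, m_num_train], [m_cat_train, m_num_train], validation_data=([m_cat_test, m_num_test], [m_cat_test, m_num_test]), ...)`.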
What is the problem here? As far as I can tell, the dimensions of the inputs and outputs are fine, so how should I interpret the error message? And what does 'cardinality' mean here?
A related bonus question: how do I define an EarlyStopping callback for fitting that somehow combines the two metrics into a single stopping condition?
from keras.callbacks import EarlyStopping
earlyStopping = EarlyStopping(monitor='val_accuracy',
                              restore_best_weights=True,
                              mode='max')
Which val_accuracy is observed in the above code?
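One caveat on the bonus question: in a multi-output model Keras typically logs per-output metrics under names like `val_cat_output_accuracy` and `val_num_output_accuracy`, so a bare `'val_accuracy'` may not match any logged metric at all (monitoring the combined `val_loss` is a common alternative). The idea of combining the two metrics into one stopping condition can be sketched framework-free, mirroring what a custom Keras callback would do; the metric values below are hypothetical:

```python
class CombinedEarlyStopper:
    """Tracks a combined score (here the mean of two accuracies) per epoch
    and signals stopping after `patience` epochs without improvement."""
    def __init__(self, patience=3):
        self.patience = patience
        self.best = float("-inf")
        self.wait = 0
        self.stopped_epoch = None

    def update(self, epoch, logs):
        # Combine the two per-output metrics into one number
        # (names mirror Keras's "val_<output_name>_accuracy" pattern).
        score = 0.5 * (logs["val_cat_output_accuracy"]
                       + logs["val_num_output_accuracy"])
        if score > self.best:
            self.best = score
            self.wait = 0
        else:
            self.wait += 1
            if self.wait >= self.patience:
                self.stopped_epoch = epoch
                return True  # stop training
        return False

# Hypothetical metric history: improvement stalls after epoch 2.
stopper = CombinedEarlyStopper(patience=2)
history = [
    {"val_cat_output_accuracy": 0.70, "val_num_output_accuracy": 0.50},
    {"val_cat_output_accuracy": 0.75, "val_num_output_accuracy": 0.55},
    {"val_cat_output_accuracy": 0.80, "val_num_output_accuracy": 0.60},
    {"val_cat_output_accuracy": 0.79, "val_num_output_accuracy": 0.60},
    {"val_cat_output_accuracy": 0.80, "val_num_output_accuracy": 0.59},
]
for epoch, logs in enumerate(history):
    if stopper.update(epoch, logs):
        print(f"stopping at epoch {epoch}")  # stops at epoch 4
        break
```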