ValueError: Shapes (None, 16) and (None, 16, 16) are incompatible (LSTMs)

Posted on 2025-01-18 15:08:14

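I am building an English-to-Hindi translation model and I keep getting this error. I am still new to this, so I couldn't figure out my mistake. I used an encoder-decoder model, and I still have to build the inference model for the decoder. I referred to several tutorials, but none of them used text vectorization. Is there a reason for that?

Any tutorials or articles are also appreciated. Thanks!

My model is: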

from tensorflow.keras.layers import Input, LSTM, Dense, TimeDistributed
from tensorflow.keras.models import Model

# text_vectorizor_english, text_vectorizer_hindi, embedding_english,
# embedding_hindi and max_length are defined earlier (not shown).

# Encoder
en_inputs = Input(shape=(1,), batch_size=32, dtype='string')
x = text_vectorizor_english(en_inputs)
x = embedding_english(x)
en_lstm = LSTM(256, return_state=True)
en_op, en_h, en_c = en_lstm(x)
en_states = [en_h, en_c]

# Decoder, initialised with the encoder's final states
de_inputs = Input(shape=(1,), batch_size=32, dtype='string')
y = text_vectorizer_hindi(de_inputs)
y = embedding_hindi(y)
de_lstm = LSTM(256, return_sequences=True, return_state=True)
de_op, _, w = de_lstm(y, initial_state=en_states)
de_den = TimeDistributed(Dense(max_length, activation='softmax'))
de_out = de_den(de_op)

# y_t is the decoder target
y_t_t = text_vectorizer_hindi(y_t)
y_t_t

model_1 = Model([en_inputs, de_inputs], de_out)
model_1.compile(optimizer='rmsprop', loss='categorical_crossentropy', metrics=['accuracy'])
model_1.fit([X_train, y_train], y_t_t,
            epochs=10,
            batch_size=32,
            validation_split=0.2)

For model_1.summary():

Model: "model_5"

__________________________________________________________________________________________________
 Layer (type)                   Output Shape         Param #     Connected to
==================================================================================================
 input_42 (InputLayer)          [(32, 1)]            0           []

 input_46 (InputLayer)          [(32, 1)]            0           []

 text_vectorization (TextVector  (None, 16)          0           ['input_42[0][0]']
 ization)

 text_vectorization_1 (TextVect  (None, 16)          0           ['input_46[0][0]']
 orization)

 embedding_2 (Embedding)        multiple             2560000     ['text_vectorization[18][0]']

 embedding_3 (Embedding)        multiple             2560000     ['text_vectorization_1[3][0]']

 lstm_33 (LSTM)                 [(None, 256),        525312      ['embedding_2[19][0]']
                                 (None, 256),
                                 (None, 256)]

 lstm_36 (LSTM)                 [(None, 16, 256),    525312      ['embedding_3[4][0]',
                                 (None, 256),                     'lstm_33[0][1]',
                                 (None, 256)]                     'lstm_33[0][2]']

 dense_7 (Dense)                (None, 16, 16)       4112        ['lstm_36[0][0]']

==================================================================================================
Total params: 6,174,736
Trainable params: 6,174,736
Non-trainable params: 0
__________________________________________________________________________________________________
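
The parameter counts in this summary can be worked backwards to see what sizes the layers actually have. The sketch below assumes the standard Keras parameter formulas for default `Dense`, `LSTM` (with bias) and `Embedding` layers. The names `dense_units`, `embedding_dim` and `vocab_size` are introduced here for illustration only, and the embedding width is inferred because the summary lists it as "multiple".

```python
# Numbers copied from the model summary above.
embedding_params = 2_560_000
lstm_params = 525_312
dense_params = 4_112
lstm_units = 256  # from LSTM(256, ...)

# Dense: params = (inputs + 1 bias) * units; its input is the LSTM output.
dense_units = dense_params // (lstm_units + 1)

# LSTM: params = 4 gates * units * (input_dim + units + 1 bias).
embedding_dim = lstm_params // (4 * lstm_units) - lstm_units - 1

# Embedding: params = vocabulary size * embedding_dim.
vocab_size = embedding_params // embedding_dim

print(dense_units, embedding_dim, vocab_size)  # prints: 16 256 10000
```

Under those assumptions, the final layer predicts over 16 classes (the value of `max_length`), while the embedding tables appear to cover a vocabulary of 10,000 tokens.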

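I tried sparse_categorical_crossentropy and got an error:

Received a label value of 9914 which is outside the valid range of [0, 16). Label values: 49 275 537 etc.

I searched for that error and found a post suggesting loss='mean_squared_error'. I tried that too and got an error:

Incompatible shapes: [32,16,16] vs. [32,16]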
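
One possible reading of these three errors, offered as a sketch rather than a confirmed diagnosis: the targets are integer token ids of shape (batch, 16), while the model emits 16 probabilities per timestep because the last `Dense` layer was given `max_length` units. Sparse categorical cross-entropy expects exactly that pairing of shapes, (batch, timesteps) labels against (batch, timesteps, num_classes) predictions, but every label must be smaller than `num_classes`. The pure-Python function below is a toy re-implementation written for illustration, not the Keras code, and the sizes are deliberately tiny.

```python
import math


def sparse_categorical_crossentropy(y_true, y_pred):
    """Mean cross-entropy for integer labels.

    y_true: nested list of shape (batch, timesteps) holding token ids.
    y_pred: nested list of shape (batch, timesteps, num_classes) holding
            per-class probabilities.
    """
    total, count = 0.0, 0
    for label_row, prob_row in zip(y_true, y_pred):
        if len(label_row) != len(prob_row):
            raise ValueError("timestep dimensions differ")
        for label, probs in zip(label_row, prob_row):
            num_classes = len(probs)
            if not 0 <= label < num_classes:
                raise ValueError(
                    f"label {label} is outside the valid range [0, {num_classes})"
                )
            total += -math.log(probs[label])
            count += 1
    return total / count


# Toy sizes (hypothetical): batch of 1, 2 timesteps, 4 classes.
y_pred = [[[0.7, 0.1, 0.1, 0.1], [0.25, 0.25, 0.25, 0.25]]]

# Labels inside [0, 4) work: one integer per timestep, no one-hot encoding.
loss = sparse_categorical_crossentropy([[0, 3]], y_pred)

# A label as large as a real token id fails, like the error quoted above.
try:
    sparse_categorical_crossentropy([[0, 9914]], y_pred)
    error_message = None
except ValueError as exc:
    error_message = str(exc)
print(error_message)
```

If this reading is right, the corresponding change in the Keras model would be to size the output layer by the Hindi vocabulary, for example `Dense(vocab_size, activation='softmax')` with `loss='sparse_categorical_crossentropy'`, where `vocab_size` is a hypothetical name for the vocabulary size of the Hindi vectorizer. With `categorical_crossentropy` or `mean_squared_error`, the targets would instead need to be one-hot encoded so that they match the prediction shape.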
