ValueError: Shapes (None, 16) and (None, 16, 16) are incompatible (LSTMs)

Posted on 2025-01-18 15:08:14

I am building an English-to-Hindi translation model and I keep getting this error. I am still new to this, so I couldn't figure out my mistake. I used the encoder-decoder model, and I still have to build the inference model for the decoder. I referred to several tutorials, but none of them used text vectorization. Is there a reason for that?

Any tutorials or articles are also appreciated. Thanks!

My model is:

en_inputs = Input(shape=(1,), batch_size=32, dtype='string')
x = text_vectorizor_english(en_inputs)
x = embedding_english(x)
en_lstm = LSTM(256, return_state=True)
en_op, en_h, en_c = en_lstm(x)
en_states = [en_h, en_c]

de_inputs = Input(shape=(1,), batch_size=32, dtype='string')
y = text_vectorizer_hindi(de_inputs)
y = embedding_hindi(y)
de_lstm = LSTM(256, return_sequences=True, return_state=True)
de_op, _, w = de_lstm(y, initial_state=en_states)
de_den = TimeDistributed(Dense(max_length, activation='softmax'))
de_out = de_den(de_op)

# y_t is the decoder target
y_t_t = text_vectorizer_hindi(y_t)
y_t_t

model_1 = Model([en_inputs, de_inputs], de_out)
model_1.compile(optimizer='rmsprop', loss='categorical_crossentropy', metrics=['accuracy'])
model_1.fit([X_train, y_train], y_t_t,
            epochs=10,
            batch_size=32,
            validation_split=0.2)

For model_1.summary():

Model: "model_5"
__________________________________________________________________________________________________
Layer (type)                   Output Shape         Param #     Connected to
==================================================================================================
input_42 (InputLayer)          [(32, 1)]            0           []
input_46 (InputLayer)          [(32, 1)]            0           []
text_vectorization (TextVector  (None, 16)          0           ['input_42[0][0]']
ization)
text_vectorization_1 (TextVect  (None, 16)          0           ['input_46[0][0]']
orization)
embedding_2 (Embedding)        multiple             2560000     ['text_vectorization[18][0]']
embedding_3 (Embedding)        multiple             2560000     ['text_vectorization_1[3][0]']
lstm_33 (LSTM)                 [(None, 256),        525312      ['embedding_2[19][0]']
                                (None, 256),
                                (None, 256)]
lstm_36 (LSTM)                 [(None, 16, 256),    525312      ['embedding_3[4][0]',
                                (None, 256),                     'lstm_33[0][1]',
                                (None, 256)]                     'lstm_33[0][2]']
dense_7 (Dense)                (None, 16, 16)       4112        ['lstm_36[0][0]']
==================================================================================================
Total params: 6,174,736
Trainable params: 6,174,736
Non-trainable params: 0
__________________________________________________________________________________________________

I tried sparse_categorical_crossentropy and got an error:

Received a label value of 9914 which is outside the valid range of [0, 16). Label values: 49 275 537 and so on.
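One possible reading of this message, offered as a sketch rather than a confirmed diagnosis: a sparse cross-entropy loss treats the last axis of the prediction as the set of classes, so every integer label has to be smaller than that axis. The final Dense layer in the summary has 16 units, while the labels are token IDs such as 9914. The framework-free example below mimics that check for a single timestep. The function is a hypothetical stand-in, not the Keras implementation, and `vocab_size = 10000` is an assumption inferred from the summary (2,560,000 embedding parameters divided by an embedding width of 256).

```python
import math


def sparse_categorical_crossentropy(probs, label):
    """Cross-entropy for one timestep.

    probs is one softmax row (a list of class probabilities) and
    label is an integer class id. Hypothetical helper for illustration.
    """
    num_classes = len(probs)
    if not 0 <= label < num_classes:
        raise ValueError(
            f"label {label} is outside the valid range [0, {num_classes})"
        )
    return -math.log(probs[label])


max_length = 16     # mirrors Dense(max_length) in the post
vocab_size = 10000  # assumption: 2,560,000 embedding params / 256 dims
token_id = 9914     # label value quoted in the error message

# A softmax row with one probability per sequence position (16 classes).
uniform_over_positions = [1.0 / max_length] * max_length
# A softmax row with one probability per vocabulary entry (10000 classes).
uniform_over_vocab = [1.0 / vocab_size] * vocab_size

try:
    sparse_categorical_crossentropy(uniform_over_positions, token_id)
    out_of_range = False
except ValueError as err:
    out_of_range = True
    print(err)

# With one class per vocabulary entry, the same label is valid.
loss = sparse_categorical_crossentropy(uniform_over_vocab, token_id)
print(loss)
```

If this reading is right, the output layer would need one unit per vocabulary entry rather than one per sequence position.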

I searched for that error, and someone suggested trying loss='mean_squared_error'. I tried that too and got an error:

Incompatible shapes: [32,16,16] vs. [32,16]
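Losses such as categorical_crossentropy and mean_squared_error compare predictions and targets element by element, so they generally expect both to have the same shape. Here the predictions have three axes (batch, timesteps, classes), while the vectorized targets have two (batch, timesteps). The sketch below uses plain nested lists and hypothetical helper names to show the mismatch, and how one-hot encoding the integer targets adds the missing axis. The batch size of 2 and the choice of 16 classes are illustration values only.

```python
def shape_of(nested):
    """Return the shape of a rectangular nested list as a tuple."""
    shape = []
    while isinstance(nested, list):
        shape.append(len(nested))
        nested = nested[0]
    return tuple(shape)


def one_hot(token_ids, num_classes):
    """Turn (batch, timesteps) integer ids into
    (batch, timesteps, num_classes) one-hot rows."""
    return [
        [[1.0 if k == tok else 0.0 for k in range(num_classes)] for tok in row]
        for row in token_ids
    ]


# Illustration sizes only: a batch of 2 instead of 32.
batch, timesteps, num_classes = 2, 16, 16

# Stand-in for the softmax output of the model.
predictions = [
    [[1.0 / num_classes] * num_classes for _ in range(timesteps)]
    for _ in range(batch)
]
# Stand-in for the integer targets produced by text vectorization.
targets = [[t % num_classes for t in range(timesteps)] for _ in range(batch)]

print(shape_of(predictions))                    # three axes
print(shape_of(targets))                        # two axes
print(shape_of(one_hot(targets, num_classes)))  # three axes again
```

One-hot targets only make sense when the number of classes covers every token ID, which the 16-unit output layer in the summary does not appear to do.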
