How to feed BERT embeddings into an LSTM layer
I want to use BERT embeddings with an LSTM layer for sentiment analysis. Here is my code:
i = tf.keras.layers.Input(shape=(), dtype=tf.string, name='text')
x = bert_preprocess(i)
x = bert_encoder(x)
x = tf.keras.layers.Dropout(0.2, name="dropout")(x['pooled_output'])
x = tf.keras.layers.LSTM(128, dropout=0.2)(x)
x = tf.keras.layers.Dense(128, activation='relu')(x)
x = tf.keras.layers.Dense(1, activation='sigmoid', name="output")(x)
model = tf.keras.Model(i, x)
When I compile this code, I get the following error:
ValueError: Input 0 of layer "lstm_2" is incompatible with the layer: expected
ndim=3, found ndim=2. Full shape received: (None, 768)
Is the logic of the code correct? Can someone correct my code?
1 Answer
From BERT-like models you can generally expect three kinds of outputs (taken from Hugging Face's TFBertModel documentation):

- last_hidden_state with shape (batch_size, sequence_length, hidden_size)
- pooler_output with shape (batch_size, hidden_size)
- hidden_states with shape (batch_size, sequence_length, hidden_size)

hidden_size is 768 above. As the error says, the output of the dropout layer has only 2 dimensions instead of the expected 3 (essentially the output of the bert_encoder layer, since dropout layers do not change tensor shape).
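You can see the mismatch by inspecting the encoder's outputs directly. A minimal check, assuming the bert_preprocess/bert_encoder TF Hub layers from the question (TF Hub names the two relevant outputs 'pooled_output' and 'sequence_output'; Hugging Face's TFBertModel calls them pooler_output and last_hidden_state):

outputs = bert_encoder(bert_preprocess(tf.constant(["a sample sentence"])))
print(outputs['pooled_output'].shape)    # (1, 768) -> 2-D, what the LSTM rejects
print(outputs['sequence_output'].shape)  # (1, seq_len, 768) -> 3-D, what the LSTM expects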
So if you are planning to use an LSTM layer after the bert_encoder layer, you need a three-dimensional input to the LSTM of the form (batch_size, num_timesteps, num_features). Hence you would have to use either the hidden_states or the last_hidden_state output instead of pooler_output. You will have to choose between the two depending on your objective/use-case.
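Putting that together, here is a minimal sketch of the corrected model, assuming TF Hub BERT layers (the handles below are examples; substitute the preprocess/encoder pair you actually use). TF Hub encoders expose the token-level embeddings under the 'sequence_output' key, the analogue of last_hidden_state:

import tensorflow as tf
import tensorflow_hub as hub

# Example TF Hub handles -- swap in whichever preprocess/encoder pair you use.
bert_preprocess = hub.KerasLayer(
    "https://tfhub.dev/tensorflow/bert_en_uncased_preprocess/3")
bert_encoder = hub.KerasLayer(
    "https://tfhub.dev/tensorflow/bert_en_uncased_L-12_H-768_A-12/4")

i = tf.keras.layers.Input(shape=(), dtype=tf.string, name='text')
x = bert_preprocess(i)
x = bert_encoder(x)
# 'sequence_output' has shape (batch_size, sequence_length, 768):
# one 768-dim vector per token, the 3-D input the LSTM expects.
x = tf.keras.layers.Dropout(0.2, name="dropout")(x['sequence_output'])
x = tf.keras.layers.LSTM(128, dropout=0.2)(x)
x = tf.keras.layers.Dense(128, activation='relu')(x)
x = tf.keras.layers.Dense(1, activation='sigmoid', name="output")(x)
model = tf.keras.Model(i, x)
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])

The LSTM's final hidden state then summarizes the whole token sequence, and that summary is what the sigmoid head classifies.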