How to feed BERT embeddings into an LSTM layer

Posted 2025-01-23 12:58:29

I want to do sentiment analysis using BERT embeddings and an LSTM layer. This is my code:

i = tf.keras.layers.Input(shape=(), dtype=tf.string, name='text')
x = bert_preprocess(i)
x = bert_encoder(x)
x = tf.keras.layers.Dropout(0.2, name="dropout")(x['pooled_output'])
x = tf.keras.layers.LSTM(128, dropout=0.2)(x)
x = tf.keras.layers.Dense(128, activation='relu')(x)
x = tf.keras.layers.Dense(1, activation='sigmoid', name="output")(x)

model = tf.keras.Model(i, x)

When compiling this code I got the following error:

ValueError: Input 0 of layer "lstm_2" is incompatible with the layer: expected 
ndim=3, found ndim=2. Full shape received: (None, 768)

Is the logic of my code correct? Can anyone please correct my code?


Comments (1)

疯到世界奔溃 2025-01-30 12:58:29

From BERT-like models you can generally expect three kinds of outputs (taken from Hugging Face's TFBertModel documentation):

  • last_hidden_state with shape (batch_size, sequence_length, hidden_size)
  • pooler_output with shape (batch_size, hidden_size)
  • hidden_states, a tuple with one tensor per layer, each with shape (batch_size, sequence_length, hidden_size)

hidden_size is 768 above.
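
For reference, here is a minimal sketch that prints the shapes of the first two outputs (it assumes the transformers package is installed; the model name is just an example):

from transformers import BertTokenizer, TFBertModel

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = TFBertModel.from_pretrained("bert-base-uncased")

# Tokenize a toy batch and run it through the encoder
inputs = tokenizer(["a sample sentence"], return_tensors="tf")
outputs = model(inputs)

print(outputs.last_hidden_state.shape)  # (1, sequence_length, 768)
print(outputs.pooler_output.shape)      # (1, 768)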

As the error says, the output of the dropout layer (and effectively of the bert_encoder layer, since dropout does not change tensor shape) has only 2 dimensions, while the LSTM expects a 3-dimensional input.

x = bert_encoder(x)
x = tf.keras.layers.Dropout(0.2, name="dropout")(x['pooled_output'])
x = tf.keras.layers.LSTM(128, dropout=0.2)(x)

So if you are planning to use an LSTM layer after the bert_encoder layer, you need a three-dimensional input to the LSTM of the form (batch_size, num_timesteps, num_features); hence you have to use either the hidden_states or the last_hidden_state output instead of pooler_output.
You will have to choose between the two depending on your objective/use case.
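
Assuming bert_preprocess and bert_encoder are TF Hub KerasLayers (which the question's code suggests), the per-token output corresponding to last_hidden_state is exposed under the encoder's 'sequence_output' key. A minimal corrected sketch under that assumption follows; the hub handles are illustrative, so substitute whichever BERT variant you actually use:

import tensorflow as tf
import tensorflow_hub as hub
import tensorflow_text  # registers the ops the preprocessing layer needs

# Illustrative TF Hub handles; swap in your own BERT variant
bert_preprocess = hub.KerasLayer(
    "https://tfhub.dev/tensorflow/bert_en_uncased_preprocess/3")
bert_encoder = hub.KerasLayer(
    "https://tfhub.dev/tensorflow/bert_en_uncased_L-12_H-768_A-12/4",
    trainable=False)

i = tf.keras.layers.Input(shape=(), dtype=tf.string, name='text')
x = bert_preprocess(i)
x = bert_encoder(x)
# 'sequence_output' is 3-D: (batch_size, sequence_length, 768)
x = tf.keras.layers.Dropout(0.2, name="dropout")(x['sequence_output'])
x = tf.keras.layers.LSTM(128, dropout=0.2)(x)
x = tf.keras.layers.Dense(128, activation='relu')(x)
x = tf.keras.layers.Dense(1, activation='sigmoid', name="output")(x)

model = tf.keras.Model(i, x)
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])

The only substantive change is feeding x['sequence_output'] (3-D) rather than x['pooled_output'] (2-D) into the LSTM; the LSTM then summarizes the token sequence into a single 128-dimensional vector for the classifier head.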
