How to feed BERT embeddings into an LSTM layer

Posted 2025-01-23 12:58:29

I want to do sentiment analysis using BERT embeddings and an LSTM layer. This is my code:

i = tf.keras.layers.Input(shape=(), dtype=tf.string, name='text')
x = bert_preprocess(i)
x = bert_encoder(x)
x = tf.keras.layers.Dropout(0.2, name="dropout")(x['pooled_output'])
x = tf.keras.layers.LSTM(128, dropout=0.2)(x)
x = tf.keras.layers.Dense(128, activation='relu')(x)
x = tf.keras.layers.Dense(1, activation='sigmoid', name="output")(x)

model = tf.keras.Model(i, x)

When compiling this code I got the following error:

ValueError: Input 0 of layer "lstm_2" is incompatible with the layer: expected 
ndim=3, found ndim=2. Full shape received: (None, 768)

Is the logic of my code correct? Can anyone please correct my code?


Comments (1)

疯到世界奔溃 2025-01-30 12:58:29

From BERT-like models you can generally expect three kinds of outputs (taken from Hugging Face's TFBertModel documentation):

  • last_hidden_state with shape (batch_size, sequence_length, hidden_size)
  • pooler_output with shape (batch_size, hidden_size)
  • hidden_states, a tuple with one tensor per layer, each with shape (batch_size, sequence_length, hidden_size)

hidden_size is 768 above.
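
For reference, here is a minimal sketch that prints the shapes of the first two outputs (it assumes the transformers package is installed; the model name is just an example):

from transformers import BertTokenizer, TFBertModel

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = TFBertModel.from_pretrained("bert-base-uncased")

# Tokenize a toy batch and run it through the encoder
inputs = tokenizer(["a sample sentence"], return_tensors="tf")
outputs = model(inputs)

print(outputs.last_hidden_state.shape)  # (1, sequence_length, 768)
print(outputs.pooler_output.shape)      # (1, 768)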

As the error says, the output of the dropout layer (and effectively of the bert_encoder layer, since dropout does not change tensor shape) has only 2 dimensions, while the LSTM expects a 3-dimensional input.

x = bert_encoder(x)
x = tf.keras.layers.Dropout(0.2, name="dropout")(x['pooled_output'])
x = tf.keras.layers.LSTM(128, dropout=0.2)(x)

So if you are planning to use an LSTM layer after the bert_encoder layer, you need a three-dimensional input to the LSTM of the form (batch_size, num_timesteps, num_features); hence you have to use either the hidden_states or the last_hidden_state output instead of pooler_output.
You will have to choose between the two depending on your objective/use case.
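
Assuming bert_preprocess and bert_encoder are TF Hub KerasLayers (which the question's code suggests), the per-token output corresponding to last_hidden_state is exposed under the encoder's 'sequence_output' key. A minimal corrected sketch under that assumption follows; the hub handles are illustrative, so substitute whichever BERT variant you actually use:

import tensorflow as tf
import tensorflow_hub as hub
import tensorflow_text  # registers the ops the preprocessing layer needs

# Illustrative TF Hub handles; swap in your own BERT variant
bert_preprocess = hub.KerasLayer(
    "https://tfhub.dev/tensorflow/bert_en_uncased_preprocess/3")
bert_encoder = hub.KerasLayer(
    "https://tfhub.dev/tensorflow/bert_en_uncased_L-12_H-768_A-12/4",
    trainable=False)

i = tf.keras.layers.Input(shape=(), dtype=tf.string, name='text')
x = bert_preprocess(i)
x = bert_encoder(x)
# 'sequence_output' is 3-D: (batch_size, sequence_length, 768)
x = tf.keras.layers.Dropout(0.2, name="dropout")(x['sequence_output'])
x = tf.keras.layers.LSTM(128, dropout=0.2)(x)
x = tf.keras.layers.Dense(128, activation='relu')(x)
x = tf.keras.layers.Dense(1, activation='sigmoid', name="output")(x)

model = tf.keras.Model(i, x)
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])

The only substantive change is feeding x['sequence_output'] (3-D) rather than x['pooled_output'] (2-D) into the LSTM; the LSTM then summarizes the token sequence into a single 128-dimensional vector for the classifier head.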
