I am trying to do sentiment classification with an LSTM and pre-trained BERT embeddings, and later language translation with a Transformer.
First of all, I installed
!pip install ktrain
!pip install tensorflow_text
and imported the necessary libraries:
import pathlib
import random
import numpy as np
from typing import Tuple, List
import matplotlib.pyplot as plt
import matplotlib.ticker as ticker
from sklearn.model_selection import train_test_split
# tensorflow imports
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras.layers import (
TextVectorization, LSTM, Dense, Embedding, Dropout,
Layer, Input, MultiHeadAttention, LayerNormalization)
from tensorflow.keras.models import Sequential, Model
from tensorflow.keras.initializers import Constant
from tensorflow.keras import backend as K
import tensorflow_text as tf_text
import ktrain
from ktrain import text
Then I downloaded and extracted the Large Movie Review dataset from Stanford:
!wget https://ai.stanford.edu/~amaas/data/sentiment/aclImdb_v1.tar.gz
!tar -xzf aclImdb_v1.tar.gz
1- I try to train an LSTM, creating the training and test sets with the texts_from_folder function of the ktrain.text module:
%reload_ext autoreload
%autoreload 2
%matplotlib inline
import os
os.environ["CUDA_DEVICE_ORDER"]="PCI_BUS_ID";
os.environ["CUDA_VISIBLE_DEVICES"]="0";
DATADIR = '/content/aclImdb'
trn, val, preproc = text.texts_from_folder(
    DATADIR, max_features=20000, maxlen=400, ngram_range=1,
    preprocess_mode='standard', train_test_names=['train', 'test'],
    classes=['pos', 'neg'])
And here I am trying to build the LSTM model
K.clear_session()

# extra imports needed by this model
from tensorflow.keras.layers import GlobalMaxPool1D
from tensorflow.keras.optimizers import Adam

MAX_SEQUENCE_LEN = 400  # matches maxlen used in texts_from_folder
NUM_CLASSES = 2         # 'pos' / 'neg'

def build_LSTM_model(
        embedding_size: int,
        total_words: int,
        lstm_hidden_size: int,
        dropout_rate: float) -> Sequential:
    model = Sequential()
    # Embedding layer: vocabulary size x embedding dimension, fed with padded sequences
    model.add(Embedding(input_dim=total_words, output_dim=embedding_size,
                        input_length=MAX_SEQUENCE_LEN))
    model.add(LSTM(lstm_hidden_size, return_sequences=True, name="lstm_layer"))
    model.add(GlobalMaxPool1D())
    model.add(Dropout(dropout_rate))
    # final Dense layer maps to the two sentiment classes
    model.add(Dense(NUM_CLASSES, activation="softmax"))
    model.compile(loss='categorical_crossentropy',
                  optimizer=Adam(learning_rate=0.01),
                  metrics=['categorical_accuracy'])
    model.summary()
    return model
with the following requirements for a sequential model. The model should include:
One Embedding layer at the beginning. (Watch out for proper parameterization!)
At least one LSTM layer.
At least one Dropout layer for regularization.
One final Dense layer mapping to the outputs.
Compile the model with categorical_crossentropy loss and the adam optimizer; you might also want to add other metrics, for example CategoricalAccuracy makes sense here.
And then I want to use the ktrain library's get_learner method to create an easily trainable version of the previous model, and to use the test set as the val_data to see the performance. (This does not include a proper train-validation-test split, but it could be extended if required.)
I am using the learner's lr_find and lr_plot methods to determine the most effective learning rate for the model, specifying the max_epochs parameter of lr_find (a couple of epochs!) to limit the time this takes, and then choosing the best learning rate from the plot as a balance between the fastest convergence and stability.
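The intended wiring would look roughly like this (a sketch only: the embedding size, LSTM width, dropout rate and batch size are illustrative values that do not appear above); my actual attempt, which triggers the error, follows.
# sketch of the intended get_learner / lr_find flow (illustrative hyperparameters)
lstm_model = build_LSTM_model(embedding_size=128, total_words=20000,  # total_words should match max_features
                              lstm_hidden_size=64, dropout_rate=0.2)
learner = ktrain.get_learner(lstm_model, train_data=trn, val_data=val, batch_size=64)
learner.lr_find(max_epochs=2)   # limit the LR range test to a couple of epochs
learner.lr_plot()               # pick a rate where the loss is still falling steeply
# learner.fit_onecycle(chosen_lr, n_epochs)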
learner: ktrain.Learner
model = text.text_classifier('bert', trn , preproc=preproc)
learner.lr_find()
learner.lr_plot()
learner.fit_onecycle(1e-4, 1)
I faced the following error:
ValueError Traceback (most recent call last)
in ()
6 # workers=8, use_multiprocessing=False, batch_size=64)
7
----> 8 model = text.text_classifier('bert', trn , preproc=preproc)
10 # learner.lr_find()
1 frames
/usr/local/lib/python3.7/dist-packages/ktrain/text/models.py in _text_model(name, train_data, preproc, multilabel, classification, metrics, verbose)
109 raise ValueError(
110 "if '%s' is selected model, then preprocess_mode='%s' should be used and vice versa"
--> 111 % (BERT, BERT)
    112         )
    113     is_huggingface = U.is_huggingface(data=train_data)
ValueError: if 'bert' is selected model, then preprocess_mode='bert' should be used and vice versa
And the next step is to do the same with an LSTM fed with pretrained static word vectors.
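For that step, the usual Keras pattern is to freeze a pretrained embedding matrix inside the Embedding layer via the Constant initializer already imported above. A minimal sketch (the GloVe file path, the word_index mapping and the dimensions are assumptions for illustration, not part of the post):
import numpy as np
from tensorflow.keras.layers import Embedding
from tensorflow.keras.initializers import Constant

EMBEDDING_DIM = 100          # must match the pretrained vectors
MAX_SEQUENCE_LEN = 400       # same padding length as texts_from_folder

def load_glove_vectors(glove_path: str) -> dict:
    # parse a GloVe text file into {word: vector}
    vectors = {}
    with open(glove_path, encoding="utf-8") as f:
        for line in f:
            parts = line.rstrip().split(" ")
            vectors[parts[0]] = np.asarray(parts[1:], dtype="float32")
    return vectors

def build_embedding_matrix(word_index: dict, vectors: dict) -> np.ndarray:
    # row i holds the vector for the token with integer id i; unknown words stay zero
    matrix = np.zeros((len(word_index) + 1, EMBEDDING_DIM))
    for word, idx in word_index.items():
        if word in vectors:
            matrix[idx] = vectors[word]
    return matrix

# embedding_matrix = build_embedding_matrix(word_index,
#                                           load_glove_vectors("glove.6B.100d.txt"))
# frozen_embedding = Embedding(
#     input_dim=embedding_matrix.shape[0],
#     output_dim=EMBEDDING_DIM,
#     embeddings_initializer=Constant(embedding_matrix),
#     input_length=MAX_SEQUENCE_LEN,
#     trainable=False)   # static vectors: the LSTM trains on top of frozen embeddings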
Comments (1)
If you're using BERT for pretrained word vectors supplied as features to an LSTM, then you don't need to build a separate BERT classification model. You can use
TransformerEmbedding to generate word vectors for your dataset (or use sentence-transformers). This is what the included NER models in ktrain do under the hood.
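A sketch of that approach (not from the original answer; the exact TransformerEmbedding and embed signatures may differ slightly between ktrain versions):
from ktrain.text import TransformerEmbedding

te = TransformerEmbedding('bert-base-uncased')          # any Hugging Face checkpoint name
vecs = te.embed('this movie was surprisingly good')     # contextual word vectors
print(vecs.shape)   # roughly (1, num_tokens, hidden_size) for a single input text
# these vectors can then be fed to an LSTM as precomputed features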
Also, the input feature format for a BERT model is completely different from the input features for an LSTM. As the error message indicates, to preprocess your texts for a BERT classification model, you'll need to supply preprocess_mode='bert' to texts_from_folder.
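Concretely, the fix looks roughly like this (a sketch following the standard ktrain IMDb example; the batch size and the 2e-5 learning rate are illustrative, not prescribed by the error message):
trn, val, preproc = text.texts_from_folder(
    DATADIR, maxlen=400,
    preprocess_mode='bert',                   # must match the 'bert' model below
    train_test_names=['train', 'test'],
    classes=['pos', 'neg'])

model = text.text_classifier('bert', train_data=trn, preproc=preproc)
learner = ktrain.get_learner(model, train_data=trn, val_data=val, batch_size=6)

learner.lr_find(max_epochs=2)   # keep the LR range test to a couple of epochs
learner.lr_plot()               # choose a rate where the loss still drops steeply
learner.fit_onecycle(2e-5, 1)   # one-cycle training at the chosen rate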