Computing the training and test accuracy of an LSTM model

Published 2025-01-26 12:10:50


I am building an LSTM model with the following code and I wish to calculate the training and testing accuracies of the model. I am a novice in machine learning and the only method I know for calculating the accuracy is using sklearn's "accuracy score".

from keras.models import Sequential
from keras.layers import Embedding, Flatten, Reshape, LSTM, Dropout, Dense

y_train = pd.Series(y_train)
lstm_model = Sequential()
lstm_model.add(Embedding(top_words, 32, input_length=req_length))
lstm_model.add(Flatten())
input = (req_length, 32)
lstm_model.add(Reshape(input))
lstm_model.add(LSTM(units=50, return_sequences=True))
lstm_model.add(Dropout(0.2))
lstm_model.add(Dense(256, activation='relu'))
lstm_model.add(Dropout(0.2))
lstm_model.add(Dense(1, activation='sigmoid'))
lstm_model.compile(optimizer='adam', loss='binary_crossentropy',
                   metrics=['accuracy'])
lstm = lstm_model.fit(X_train, y_train, epochs=30, batch_size=10)

To calculate y_pred, I wrote y_pred = lstm_model.predict(y_test). However, sklearn's accuracy score function fails on y_pred because its shape is (600, 401, 1).

What can I do about this? Could you share some code?


Comments (1)

烛影斜 2025-02-02 12:10:50


If you print lstm_model.summary(), you might see:

Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 embedding (Embedding)       (None, 401, 32)           160000    
                                                                 
 flatten (Flatten)           (None, 12832)             0         
                                                                 
 reshape (Reshape)           (None, 401, 32)           0         
                                                                 
 lstm (LSTM)                 (None, 401, 50)           16600     
                                                                 
 dropout (Dropout)           (None, 401, 50)           0         
                                                                 
 dense (Dense)               (None, 401, 256)          13056     
                                                                 
 dropout_1 (Dropout)         (None, 401, 256)          0         
                                                                 
 dense_1 (Dense)             (None, 401, 1)            257       
                                                                 
=================================================================
Total params: 189,913
Trainable params: 189,913
Non-trainable params: 0
_________________________________________________________________

As we can see, the last Dense layer produces output of shape (None, 401, 1). This 401 appears throughout the network and is the number of elements (words) in the sequence. As far as I understand, it comes from your req_length variable.

Now let's look closer at this line of code:

lstm_model.add(LSTM(units = 50, return_sequences = True))

Here, you specify the LSTM layer. But wait, what does return_sequences = True mean? According to the documentation:

return_sequences: Boolean. Whether to return the last output in the output sequence, or the full sequence. Default: False.

What you set here depends on your task. With True, you basically tell the LSTM layer to produce as many outputs as there are words in the sequence, so you get 401 vectors in the LSTM's output. However, as I see from your model architecture, you want to solve a binary classification task. In that case, it is more logical to output only one vector from the LSTM. Thus, I suggest setting this parameter to False:

lstm_model.add(LSTM(units = 50, return_sequences = False))
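To make the difference concrete, here is a small numpy sketch of the two return modes. The array is a stand-in for the hidden states an LSTM with 50 units would compute over your 401 timesteps, not real model output:

```python
import numpy as np

timesteps, units = 401, 50

# Stand-in for the hidden state the LSTM computes at every timestep.
all_states = np.zeros((timesteps, units))

# return_sequences=True  -> the full sequence of states, one per word.
full_sequence = all_states
# return_sequences=False -> only the state at the final timestep.
last_output = all_states[-1]

print(full_sequence.shape)  # (401, 50)
print(last_output.shape)    # (50,)
```

With return_sequences=False, every layer after the LSTM sees a single 50-dimensional vector per sample, which is what a one-label-per-sequence classifier needs.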

Compare the new model architecture with the previous one:

Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 embedding (Embedding)       (None, 401, 32)           160000    
                                                                 
 flatten (Flatten)           (None, 12832)             0         
                                                                 
 reshape (Reshape)           (None, 401, 32)           0         
                                                                 
 lstm (LSTM)                 (None, 50)                16600     
                                                                 
 dropout (Dropout)           (None, 50)                0         
                                                                 
 dense (Dense)               (None, 256)               13056     
                                                                 
 dropout_1 (Dropout)         (None, 256)               0         
                                                                 
 dense_1 (Dense)             (None, 1)                 257       
                                                                 
=================================================================
Total params: 189,913
Trainable params: 189,913
Non-trainable params: 0
_________________________________________________________________

Now your LSTM layer outputs only one vector of size 50 per batch element instead of 401 vectors. Therefore, in the end, you no longer have 401 values per batch element; you have a single value, which is the prediction for the input sequence.
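Once predict returns one sigmoid probability per sample, computing accuracy the sklearn way reduces to thresholding and comparing. A minimal sketch with made-up numbers standing in for lstm_model.predict(X_test) and the true labels:

```python
import numpy as np

# Stand-in for lstm_model.predict(X_test): shape (num_samples, 1),
# not (600, 401, 1), once return_sequences=False is used.
y_prob = np.array([[0.9], [0.2], [0.7], [0.4]])
y_test = np.array([1, 0, 0, 0])  # stand-in for the true labels

# Threshold the sigmoid probabilities at 0.5 to get hard 0/1 labels.
y_pred = (y_prob > 0.5).astype(int).ravel()

# sklearn's accuracy_score(y_test, y_pred) computes this same mean of matches.
accuracy = np.mean(y_pred == y_test)
print(accuracy)  # 0.75
```

The same thresholding applies to the training set; alternatively, lstm_model.evaluate(X_test, y_test) reports the accuracy metric you already compiled with.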
