LSTM Keras many-to-many classification ValueError: incompatible shapes

I have just started implementing an LSTM in Python with TensorFlow / Keras to test out an idea I had; however, I am struggling to properly create a model. This post is mainly about a ValueError that I often get (see the code at the bottom), but any and all help with creating a proper LSTM model for the problem below is greatly appreciated.

For each day, I want to predict which of a group of events will occur. The idea is that some events are recurring / always occur after a certain amount of time has passed, whereas other events occur only rarely or without any structure. An LSTM should be able to pick up on these recurring events in order to predict their occurrences for days in the future.

In order to represent the events, I use a list with values 0 and 1 (non-occurrence and occurrence). So, for example, if I have the events ["Going to school", "Going to the gym", "Buying a computer"], I get lists like [1, 0, 1], [1, 1, 0], [1, 0, 1], [1, 1, 0], etc. The idea is then that the LSTM will recognize that I go to school every day, to the gym every other day, and that buying a computer is very rare. So, following the sequence of vectors, for the next day it should predict [1, 0, 0].
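As a minimal sketch of how I build such a vector for one day (the event names and the encode_day helper below are just illustrative; my real data has 193 events):

import numpy as np

# Hypothetical catalogue of events (my real data has 193 of them)
EVENTS = ["Going to school", "Going to the gym", "Buying a computer"]
EVENT_INDEX = {name: i for i, name in enumerate(EVENTS)}

def encode_day(events_that_occurred):
    # Multi-hot vector: 1 for each event that occurred on that day, 0 elsewhere
    vec = np.zeros(len(EVENTS), dtype=np.float32)
    for name in events_that_occurred:
        vec[EVENT_INDEX[name]] = 1.0
    return vec

print(encode_day(["Going to school", "Buying a computer"]))  # -> [1. 0. 1.]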

So far I have done the following:

  1. Create x_train: a numpy.array with shape (305, 60, 193). Each entry of x_train contains 60 consecutive days, where each day is represented by a 0/1 vector over the same 193 possible events, as described above.
  2. Create y_train: a numpy.array with shape (305, 1, 193). Similar to x_train, but y_train only contains 1 day per entry.

x_train[0] consists of days 1, 2, ..., 60 and y_train[0] contains day 61. x_train[1] then contains days 2, ..., 61 and y_train[1] contains day 62, etc. The idea is that the LSTM should learn to use data from the past 60 days, and that it can then iteratively start predicting/generating new vectors of event occurrences for future days.
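As an illustration, a minimal sketch of how these sliding windows can be built from one array of daily vectors (daily_events and the random data below are just placeholders for my real data):

import numpy as np

WINDOW = 60  # days of history per sample

def make_windows(daily_events, window=WINDOW):
    # daily_events: array of shape (num_days, 193), one 0/1 vector per day
    x, y = [], []
    for start in range(len(daily_events) - window):
        x.append(daily_events[start:start + window])               # days start+1 .. start+60
        y.append(daily_events[start + window:start + window + 1])  # day start+61, kept as a length-1 slice
    return np.array(x), np.array(y)

daily_events = np.random.randint(0, 2, size=(365, 193)).astype(np.float32)  # made-up data
x_train, y_train = make_windows(daily_events)
print(x_train.shape, y_train.shape)  # 365 days of data gives (305, 60, 193) and (305, 1, 193)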

I am really struggling with how to create a simple implementation of an LSTM that can handle this. So far I think I have figured out the following:

  1. I need to start with the below block of code, where N_INPUTS = 60 and N_FEATURES = 193. I am not sure what N_BLOCKS should be, or if the value it should take is strictly bound by some conditions. EDIT: According to https://zhuanlan.zhihu.com/p/58854907 it can be whatever I want.
model = Sequential()
model.add(LSTM(N_BLOCKS, input_shape=(N_INPUTS, N_FEATURES)))
  2. I should probably add a dense layer. If I want the output of my LSTM to be a vector with the 193 events, this should look as follows:
model.add(layers.Dense(193, activation='linear'))  # or some other activation function
  3. I can also add a dropout layer to prevent overfitting, for example with model.add(layers.Dropout(0.2)), where 0.2 is the rate at which inputs are set to 0.
  4. I need to add a model.compile(loss=..., optimizer=...). I am not sure if the loss function (e.g. MSE or categorical_crossentropy) and optimizer matter if I just want a working implementation (see the sketch after this list).
  5. I need to train my model, which I can achieve by using model.fit(x_train, y_train).
  6. If all of the above works well, I can start to predict values for the next day using model.predict(the 60 days before the day I want to predict).
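Putting steps 1-6 together, this is a minimal sketch of what I have in mind. Since each of the 193 events is an independent 0/1 label, I am assuming here that a sigmoid output with binary_crossentropy is a reasonable choice, although I am not sure it is the only option:

from tensorflow import keras
from tensorflow.keras import layers

N_INPUTS = 60     # days of history per sample
N_FEATURES = 193  # events per day
N_BLOCKS = 256    # number of LSTM units; apparently any value works

model = keras.Sequential()
model.add(layers.LSTM(N_BLOCKS, input_shape=(N_INPUTS, N_FEATURES)))
model.add(layers.Dropout(0.2))
model.add(layers.Dense(N_FEATURES, activation='sigmoid'))  # one independent probability per event
model.compile(loss='binary_crossentropy', optimizer='adam')
# model.fit(x_train, y_train)  # expects y_train of shape (num_samples, 193)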

One of my attempts can be seen here:

print(x_train.shape)
print(y_train.shape)

model = keras.Sequential()
model.add(layers.LSTM(256, input_shape=(x_train.shape[1], x_train.shape[2])))
model.add(layers.Dropout(0.2))
model.add(layers.Dense(y_train.shape[2], activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer='adam')
model.summary()
model.fit(x_train,y_train) #<- This line causes the ValueError

Output:
(305, 60, 193)
(305, 1, 193)
Model: "sequential_29"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 lstm_27 (LSTM)              (None, 256)               460800    
                                                                 
 dense_9 (Dense)             (None, 1)                 257       
                                                                 
=================================================================
Total params: 461,057
Trainable params: 461,057
Non-trainable params: 0
_________________________________________________________________
ValueError: Shapes (None, 1, 193) and (None, 193) are incompatible 

Alternatively, I have tried replacing the line model.add(layers.Dense(y_train.shape[2], activation='softmax')) with model.add(layers.Dense(y_train.shape[1], activation='softmax')). This produces ValueError: Shapes (None, 1, 193) and (None, 1) are incompatible.

Are my ideas somewhat okay? How can I resolve this Value Error? Any help would be greatly appreciated.

EDIT: As suggested in the comments, changing the size of y_train did the trick.

print(x_train.shape)
print(y_train.shape)

model = keras.Sequential()
model.add(layers.LSTM(193, input_shape=(x_train.shape[1], x_train.shape[2])))  # The 193 can be any number; see: https://zhuanlan.zhihu.com/p/58854907
model.add(layers.Dropout(0.2))
model.add(layers.Dense(y_train.shape[1], activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer='adam')
model.summary()
model.fit(x_train,y_train)


(305, 60, 193)
(305, 193)
Model: "sequential_40"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 lstm_38 (LSTM)              (None, 193)               298764    
                                                                 
 dropout_17 (Dropout)        (None, 193)               0         
                                                                 
 dense_16 (Dense)            (None, 193)               37442     
                                                                 
=================================================================
Total params: 336,206
Trainable params: 336,206
Non-trainable params: 0
_________________________________________________________________
10/10 [==============================] - 3s 89ms/step - loss: 595.5011

Now I am stuck on the fact that model.predict(x) requires x to be of the same size as x_train, and will output an array with the same size as y_train. I was hoping only one set of 60 days would be required to output the 61st day. Does anyone know how to achieve this?
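My current understanding is that Keras' model.predict always expects a batch dimension, so I think a single 60-day window just needs to be wrapped as a batch of size 1, but I would like to confirm this. A sketch using the last training window:

import numpy as np

single_window = np.expand_dims(x_train[-1], axis=0)  # shape (1, 60, 193): one 60-day window
next_day = model.predict(single_window)              # shape (1, 193)
print(next_day[0])                                   # predicted event vector for the following day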


Comments (1)

猫弦 2025-02-02 07:15:10

The solution may be to have y_train of shape (305, 193) instead of (305, 1, 193): since you predict one day, this does not change the data, just its shape. You should then be able to train and predict.
With model.add(layers.Dense(y_train.shape[1], activation='softmax')) of course.
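A minimal sketch of that reshape (assuming y_train currently has shape (305, 1, 193)):

import numpy as np

y_train = np.squeeze(y_train, axis=1)  # (305, 1, 193) -> (305, 193)
# or equivalently: y_train = y_train.reshape(y_train.shape[0], -1)
print(y_train.shape)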
