Recurrent neural network (LSTM) dimension error

data-mining lstm rnn
2022-03-10 03:46:22

I have data in a dataframe named ddf, as follows:

labels          X
L1          [1,2,3,7,8,9...]
L1          [4,2,6,9,8,7...]
...
L2          [5,6,8,9,6,3...]
L2          [7,8,5,6,9,0...]
...

There are 250 rows and 7 labels, and each list under X has 2000 elements. These 2000 elements are signal values over a period of about 60 seconds.
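For reference, the layout described above can be sketched with synthetic data. The values, the random seed, and the label names L1 to L7 below are assumptions made for illustration; only the sizes (250 rows, 7 labels, 2000 elements per list) come from the description. The sketch also shows that a column of lists is a 1-D object array until the lists are stacked, which may matter when the data is passed to a model.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for ddf (hypothetical values): 250 rows, 7 labels,
# and a 2000-element list of signal values in each row of column X.
rng = np.random.default_rng(0)
n_rows, n_labels, n_steps = 250, 7, 2000
ddf = pd.DataFrame({
    "labels": [f"L{i % n_labels + 1}" for i in range(n_rows)],
    "X": [rng.integers(0, 10, size=n_steps).tolist() for _ in range(n_rows)],
})

# A column of lists is stored as a 1-D object array of Python lists.
as_objects = ddf["X"].values
print(as_objects.shape, as_objects.dtype)   # (250,) object

# Stacking the lists gives a regular 2-D array: (samples, timesteps).
Xall = np.array(ddf["X"].tolist())
print(Xall.shape)                           # (250, 2000)

# One-hot encode the labels: one column per distinct label.
Yall = pd.get_dummies(ddf["labels"]).to_numpy(dtype="float32")
print(Yall.shape)                           # (250, 7)
```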

I am trying to build a recurrent neural network for the above data. Here is my code:

Xall = ddf['X'].values
Xall = np.array(Xall)

ydf = pd.get_dummies(ddf.drop('X', axis=1))
Yall = np.array(ydf.values)

# Split the data
from sklearn.model_selection import train_test_split
X_train, X_test, Y_train, Y_test = train_test_split(Xall, Yall,  test_size=0.1, random_state=0) 

from keras.models import Sequential
from keras.layers import Embedding, LSTM, Dense
model_lstm = Sequential()
model_lstm.add(Embedding(2000, 128))
model_lstm.add(LSTM(100, dropout=0.2, recurrent_dropout=0.2))
model_lstm.add(LSTM(200, dropout=0.2, recurrent_dropout=0.2))
model_lstm.add(Dense(Yall.shape[1], activation='softmax'))
model_lstm.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
model_lstm.fit(X_train, Y_train, epochs=50, verbose=True, validation_data=(X_test, Y_test))

However, I get an error at the second LSTM layer:

ValueError: Input 0 is incompatible with layer lstm_2: expected ndim=3, found ndim=2

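I think it has to do with the LSTM arguments. Are the Embedding layer's parameters OK as well? How should the two be set? Where does the error come from, and how can I fix it? Thanks for your help.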

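1 Answer

Edit after comments

Are the Embedding layer's parameters OK as well?

Yes. However, you need to pass return_sequences=True to the first LSTM layer so that it passes the full sequence on to the next LSTM layer.

From the documentation: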

return_sequences: Boolean. Whether to return the last output in the output sequence, or the full sequence.
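To make the effect of return_sequences on tensor shapes concrete, here is a minimal NumPy sketch of an LSTM forward pass. It is not Keras's implementation: the function name lstm_forward, the random weights, and the toy sizes are all assumptions made for illustration. It only shows that the full sequence is a 3D array while the last output alone is 2D, which is why a second recurrent layer cannot consume it.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_forward(x, units, return_sequences=False, seed=0):
    """Run a randomly initialised LSTM over x of shape (batch, timesteps, features)."""
    if x.ndim != 3:
        raise ValueError(f"expected ndim=3, found ndim={x.ndim}")
    batch, timesteps, features = x.shape
    rng = np.random.default_rng(seed)
    # Each matrix holds the weights of all 4 gates (input, forget, cell, output).
    W = rng.normal(scale=0.1, size=(features, 4 * units))
    U = rng.normal(scale=0.1, size=(units, 4 * units))
    b = np.zeros(4 * units)

    h = np.zeros((batch, units))   # hidden state
    c = np.zeros((batch, units))   # cell state
    outputs = []
    for t in range(timesteps):
        z = x[:, t, :] @ W + h @ U + b
        i, f, g, o = np.split(z, 4, axis=1)
        c = sigmoid(f) * c + sigmoid(i) * np.tanh(g)
        h = sigmoid(o) * np.tanh(c)
        outputs.append(h)
    if return_sequences:
        return np.stack(outputs, axis=1)   # (batch, timesteps, units)
    return h                               # (batch, units)

x = np.random.default_rng(1).normal(size=(4, 20, 8))

full = lstm_forward(x, units=5, return_sequences=True)
last = lstm_forward(x, units=5, return_sequences=False)
print(full.shape)   # (4, 20, 5)
print(last.shape)   # (4, 5)

# Stacking works only when the first layer returns the full sequence.
stacked = lstm_forward(full, units=3)
print(stacked.shape)  # (4, 3)

try:
    lstm_forward(last, units=3)
except ValueError as err:
    print(err)      # expected ndim=3, found ndim=2
```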

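How should the two be set?

From the documentation:

input_dim: Size of the vocabulary, i.e. maximum integer index + 1. This determines the maximum integer allowed in the input data: the largest integer in the input must be smaller than the vocabulary size. This should be

output_dim: Dimension of the dense embedding. Integer >= 0.

input_length: Length of input sequences, when it is constant. This argument is required if you are going to connect Flatten then Dense layers upstream (without it, the shape of the dense outputs cannot be computed).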
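The three parameters above can be pictured as a plain lookup table. The following NumPy sketch is an illustration under that assumption, not the Keras layer itself; the variable names and the batch size of 3 are hypothetical. The table has input_dim rows of output_dim values, and indexing it with a batch of integer sequences of length input_length yields a 3D array.

```python
import numpy as np

input_dim, output_dim, input_length = 100, 64, 2000

# The embedding is a lookup table with one row per integer in the vocabulary.
rng = np.random.default_rng(0)
table = rng.normal(size=(input_dim, output_dim))

# A batch of integer sequences; every value must lie in [0, input_dim - 1].
batch = rng.integers(0, input_dim, size=(3, input_length))
embedded = table[batch]
print(embedded.shape)   # (3, 2000, 64)

# An index equal to input_dim is already out of range.
try:
    table[np.array([[input_dim]])]
except IndexError:
    print("index", input_dim, "is outside a vocabulary of size", input_dim)
```

Under this picture, input_dim has to cover the range of integer values that occur in the data, while the sequence length is described by input_length.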

I am posting code with dummy data below. The vocabulary size is taken to be 100.

import numpy as np
import pandas as pd
from keras.models import Sequential
from keras.layers import Embedding, LSTM, Dense
from sklearn.model_selection import train_test_split

# Dummy data: 250 sequences of 2000 integers drawn from a vocabulary of 100,
# and one of 7 class labels per sequence.
input_array = np.random.randint(100, size=(250, 2000))
input_y = np.random.randint(7, size=250)

# One-hot encode the labels and convert them to a float array.
Y_dumy = pd.get_dummies(input_y).values.astype('float32')
X_train, X_test, Y_train, Y_test = train_test_split(
    input_array, Y_dumy, test_size=0.1, random_state=0)

model = Sequential()
model.add(Embedding(input_dim=100, output_dim=64, input_length=2000))
# The Embedding output is a 3D tensor with shape
# (batch_size, sequence_length, output_dim), i.e. (None, 2000, 64).

# return_sequences=True makes this layer emit the full sequence,
# so the next LSTM layer receives a 3D input.
model.add(LSTM(100, dropout=0.2, recurrent_dropout=0.2, return_sequences=True))
print(model.output_shape)
# Output shape should be (None, 2000, 100)

model.add(LSTM(200, dropout=0.2, recurrent_dropout=0.2))
model.add(Dense(Y_dumy.shape[1], activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
model.fit(X_train, Y_train, epochs=50, verbose=True, validation_data=(X_test, Y_test))

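Where does the error come from, and how can I fix it?

I believe the error arises because input_length is missing. For a similar error, see this post.

After comments: the error is caused by return_sequences=False (the default) in the first LSTM layer. With that setting the layer returns only its last output, a 2D tensor of shape (batch_size, units), while the second LSTM layer expects a 3D input of shape (batch_size, timesteps, features).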