How do I feed the LSTM output into an MLP via concatenation?

data-mining machine-learning deep-learning keras tensorflow time-series
2022-02-20 14:58:42

I have a time-series training dataset, shown below, where my target variable is var1(t), the value of var1 at time t.

import numpy as np
import pandas as pd
train_df = pd.DataFrame(np.random.randint(0,100,size=(100, 16)))
train_df.columns=['var1(t-3)','var2(t-3)','var3(t-3)','var4(t-3)','var1(t-2)','var2(t-2)','var3(t-2)','var4(t-2)','var1(t-1)','var2(t-1)','var3(t-1)','var4(t-1)','var1(t)','var2(t)','var3(t)','var4(t)']
train_X=train_df.drop(['var1(t)'],axis=1)
train_y=train_df[['var1(t)']]

I am feeding an LSTM network with the past 3 timesteps, (t-3) to (t-1), of all 4 variables, and then feeding the output of the LSTM, together with the current-timestep values of var2, var3, var4, into an MLP using the functional API in Keras.

So I prepared the inputs for the LSTM and the MLP as follows:

#subset the 3 previous timesteps of the 4 variables for the time series part
train_X_LSTM=train_X[train_X.columns[:12]].reset_index(drop=True).values
#target is always var1(t)
train_y_LSTM=train_y.values
#take the current timestep features var2, var3, var4, which are realised at t=t
train_X_MLP=train_X[train_X.columns[-3:]].reset_index(drop=True).values
#target is always var1(t)
train_y_MLP=train_y.values
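
As a quick sanity check on the split (using only the arrays defined above), the shapes should come out as follows:

print(train_X_LSTM.shape)  # (100, 12): 3 past timesteps x 4 variables, flattened
print(train_X_MLP.shape)   # (100, 3): var2, var3, var4 at t=t
print(train_y_LSTM.shape)  # (100, 1): var1 at t=t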

I then tried the following network:

#lstm input shape
lstm_input = Input(shape=(train_X_LSTM.shape[0],train_X_LSTM.shape[1]))
#lstm units
hidden1 = LSTM(10)(lstm_input)
hidden2 = Dense(10, activation='relu')(hidden1)
#lstm output, which is the predicted var1 at t=t
lstm_output = Dense(1, activation='sigmoid')(hidden2)
#mlp input with additional 3 variables at t=t
mlp_input=Input(shape=(train_X_MLP.shape[0],train_X_MLP.shape[1]))
#combine the lstm output (predicted var1 at t=t) with var2,var3,var4 at t=t
x = concatenate([lstm_output, mlp_input], axis=-1)
#mlp model output, which is the predicted var1 at t=t
mlp_out = Dense(1, activation='relu')(x)
#final output of the combined model: the predicted var1 at t=t
model = Model(inputs=[lstm_input, mlp_input],outputs=mlp_out)
#compile the model
model.compile(loss='mae', optimizer='adam')
#fit the model
model.fit(x_train, y_train, batch_size=64, epochs=10, validation_split=0.2)

This raises the following error:

ValueError: A Concatenate layer requires inputs with matching shapes except for the concat axis. Got inputs shapes: [(None, 1), (None, 100, 3)]

which suggests I am not combining them correctly. Any help is appreciated!

1 Answer

In Keras, you don't need to tell the model the number of samples in your training data. Moreover, the way you defined the LSTM input means each data sample is treated as one timestep of the LSTM, rather than your t-1, t-2 and t-3 values.
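
To see where the (None, 100, 3) in the error message comes from: Keras prepends the batch dimension (None) to whatever you pass as shape, so including the sample count makes it part of each sample's shape. A minimal sketch:

from tensorflow.keras.layers import Input
mlp_input = Input(shape=(train_X_MLP.shape[0], train_X_MLP.shape[1]))
print(mlp_input.shape)  # (None, 100, 3): the 100 samples became a per-sample dimension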

So you can drop train_X_LSTM.shape[0] from your Input, and pass X=[train_X_LSTM, train_X_MLP] and y=train_y_LSTM to model.fit, so that the data matches what your model expects.

#lstm input shape
lstm_input = Input(shape=(train_X_LSTM.shape[1], 1))
#lstm units
hidden1 = LSTM(10)(lstm_input)
hidden2 = Dense(10, activation='relu')(hidden1)
#lstm output, which is the predicted var1 at t=t
lstm_output = Dense(1, activation='sigmoid')(hidden2)
#mlp input with additional 3 variables at t=t
mlp_input=Input(shape=(train_X_MLP.shape[1],))
#combine the lstm output (predicted var1 at t=t) with var2,var3,var4 at t=t
x = Concatenate()([lstm_output, mlp_input])
#mlp model output, which is the predicted var1 at t=t
mlp_out = Dense(1, activation='relu')(x)
#final output of the combined model: the predicted var1 at t=t
model = Model(inputs=[lstm_input, mlp_input],outputs=mlp_out)
#compile the model
model.compile(loss='mae', optimizer='adam')
#fit the model
model.fit([train_X_LSTM, train_X_MLP], train_y_LSTM, batch_size=64, epochs=10, validation_split=0.2)
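
Before fitting, model.summary() is a quick way to confirm the two branches now merge cleanly; the Concatenate layer should report an output shape of (None, 4), i.e. the 1 LSTM prediction plus the 3 current-timestep features:

model.summary()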

Also, as I understand it, you want to feed the LSTM vectors of (var1, var2, var3, var4) values at timesteps t-3, t-2 and t-1. In that case, you should also reshape the data you feed to the LSTM:

train_X_LSTM=train_X_LSTM.reshape(-1, 3, 4)

This way, you feed sequences of 3 vectors containing (var1, var2, var3, var4) at the corresponding past timesteps. Then update the LSTM input layer:

lstm_input = Input(shape=train_X_LSTM.shape[1:])
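
This reshape works because the columns above are ordered timestep-major (all four variables at t-3, then at t-2, then at t-1), so reshape(-1, 3, 4) yields a (samples, timesteps, features) array. A quick sanity check on the arrays defined above:

print(train_X_LSTM.shape)   # (100, 3, 4): samples, timesteps, features
print(train_X_LSTM[0, 0])   # first sample's (var1, var2, var3, var4) at t-3
print(train_X_LSTM[0, -1])  # first sample's (var1, var2, var3, var4) at t-1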

Full code:

import numpy as np
import pandas as pd
train_df = pd.DataFrame(np.random.randint(0,100,size=(100, 16)))
train_df.columns=['var1(t-3)','var2(t-3)','var3(t-3)','var4(t-3)','var1(t-2)','var2(t-2)','var3(t-2)','var4(t-2)','var1(t-1)','var2(t-1)','var3(t-1)','var4(t-1)','var1(t)','var2(t)','var3(t)','var4(t)']
train_X=train_df.drop(['var1(t)'],axis=1)
train_y=train_df[['var1(t)']]
#subset the 3 previous timesteps of the 4 variables for the time series part
train_X_LSTM=train_X[train_X.columns[:12]].reset_index(drop=True).values
train_X_LSTM=train_X_LSTM.reshape(-1, 3, 4)
#target is always var1(t)
train_y_LSTM=train_y.values
#take the current timestep features var2, var3, var4, which are realised at t=t
train_X_MLP=train_X[train_X.columns[-3:]].reset_index(drop=True).values
#target is always var1(t)
train_y_MLP=train_y.values

from tensorflow.keras.layers import Input, LSTM, Dense, Concatenate
from tensorflow.keras.models import Model
#lstm input shape
lstm_input = Input(shape=train_X_LSTM.shape[1:])
#lstm units
hidden1 = LSTM(10)(lstm_input)
hidden2 = Dense(10, activation='relu')(hidden1)
#lstm output, which is the predicted var1 at t=t
lstm_output = Dense(1, activation='sigmoid')(hidden2)
#mlp input with additional 3 variables at t=t
mlp_input=Input(shape=(train_X_MLP.shape[1],))
#combine the lstm output (predicted var1 at t=t) with var2,var3,var4 at t=t
x = Concatenate()([lstm_output, mlp_input])
#mlp model output, which is the predicted var1 at t=t
mlp_out = Dense(1, activation='relu')(x)
#final output of the combined model: the predicted var1 at t=t
model = Model(inputs=[lstm_input, mlp_input],outputs=mlp_out)
#compile the model
model.compile(loss='mae', optimizer='adam')
#fit the model
model.fit([train_X_LSTM, train_X_MLP], train_y_LSTM, batch_size=64, epochs=10, validation_split=0.2)
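
For completeness, a usage sketch for inference: pass the same two-input list to model.predict (the data here is random, so the predicted values themselves are meaningless):

preds = model.predict([train_X_LSTM[:5], train_X_MLP[:5]])
print(preds.shape)  # (5, 1): one predicted var1(t) per sample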