Update
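Your example is interesting. On the one hand, it is constructed so that you really need only one parameter, with value 1:
$$y_t = \beta + w\,y_{t-1}, \qquad \beta = 0, \qquad w = 1$$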
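On the other hand, your training set is small (96 observations), yet a three-layer network has a lot of parameters, so it is very easy to overfit.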
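To put a rough number on "a lot of parameters", here is a small sketch that counts the weights by hand. It assumes the layer sizes of the model defined later in this answer (`SimpleRNN(12)`, `Dense(12)`, `Dense(1)`, scalar input); your original network may differ, and the helper names are made up for illustration.

```python
# Hand count of trainable parameters for an assumed stack:
# SimpleRNN(12) -> Dense(12) -> Dense(1), fed one scalar feature per step.

def simple_rnn_params(units, input_dim):
    # input kernel + recurrent kernel + bias
    return units * input_dim + units * units + units

def dense_params(units, input_dim):
    # weight matrix + bias
    return units * input_dim + units

n_obs = 96                              # training observations
rnn = simple_rnn_params(12, 1)          # 12 + 144 + 12
hidden = dense_params(12, 12)           # 144 + 12
out = dense_params(1, 12)               # 12 + 1
total_params = rnn + hidden + out

print(total_params, "parameters vs", n_obs, "training observations")
```

With more than three parameters per observation, the network has plenty of room to memorize the training window.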
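The most interesting part is your test code. It is not clear whether you are trying to make a series of one-step forecasts or a dynamic multi-step forecast.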
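In a one-step forecast, you predict for time $t$ and get $\hat{y}_t = f(x_t) = f(y_{t-1})$. So you always forecast one step ahead using the latest observed information, then move on to the next period.
Notice that I used $y_{t-1}$ above, not $\hat{y}_{t-1}$. This is the important distinction: a one-step forecast always uses the observed value from the previous step. A dynamic forecast, by contrast, uses the previous forecast to produce the next one: $\hat{y}_t = f(\hat{y}_{t-1})$. That is why it is called dynamic.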
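The distinction can be sketched without any neural network. In the toy example below, `f` is a hypothetical, deliberately imperfect model of a series that grows by 1 each step; it is an assumption made for illustration and has nothing to do with the fitted RNN.

```python
import numpy as np

def f(y_prev):
    # Hypothetical imperfect model: the true process is y_t = y_{t-1} + 1
    return y_prev + 0.9

y = np.arange(10, dtype=float)   # observed series 0, 1, ..., 9
n_train = 5                      # forecast the last five points

# One-step forecast: always feed the observed previous value y_{t-1}
one_step = np.array([f(y[t - 1]) for t in range(n_train, len(y))])

# Dynamic forecast: feed the previous forecast back in after the first step
dynamic = []
prev = y[n_train - 1]            # last observed value before the forecast window
for _ in range(n_train, len(y)):
    prev = f(prev)
    dynamic.append(prev)
dynamic = np.array(dynamic)

one_step_err = np.abs(one_step - y[n_train:])
dynamic_err = np.abs(dynamic - y[n_train:])
print("one-step errors:", one_step_err)
print("dynamic errors: ", dynamic_err)
```

The one-step error stays constant because every forecast starts from an observed value, while the dynamic error accumulates because each forecast inherits the error of the one before it.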
So, first, I rearranged your code and modified it so that it produces both one-step and dynamic forecasts for comparison. Here it is:
# Imports and the reshaping helper
import matplotlib.pyplot as plt
import numpy as np
from numpy.random import seed
from keras import regularizers
from keras.layers import Dense, SimpleRNN
from keras.models import Sequential
from sklearn.metrics import mean_squared_error

# Six-step repeating pattern, 120 observations in total
data = [0,1,2,3,2,1]*20

def shape_it(X):
    # Reshape to (samples, timesteps=1, features=1), as the RNN layer expects
    return np.expand_dims(X.reshape((-1,1)),2)
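# Hold the series as a 1 x n array and use the first 80% (96 points) for training
n_data = len(data)
data = np.array(data).reshape(1, -1)
n_train = int(0.8*n_data)

# Inputs are y_{t-1}, targets are y_t (the same series shifted by one step)
X_train = shape_it(data[:,:n_train])
Y_train = shape_it(data[:,1:(n_train+1)])
X_test = shape_it(data[:,n_train:-1])
Y_test = shape_it(data[:,(n_train+1):])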
# Plot training inputs against training targets
plt.plot(X_train.reshape(-1,1))
plt.plot(Y_train.reshape(-1,1))
plt.show()

# Plot test inputs against test targets
plt.plot(X_test.reshape(-1,1))
plt.plot(Y_test.reshape(-1,1))
plt.show()
# Stateful RNN followed by two dense layers, all with default activations
model = Sequential()
batch_size = 1
model.add(SimpleRNN(12, batch_input_shape=(batch_size, X_train.shape[1], X_train.shape[2]),stateful=True))
model.add(Dense(12))
model.add(Dense(1))
model.compile(loss='mean_squared_error', optimizer='adam')

# Train one epoch at a time, in order, resetting the RNN state after each pass
epochs = 1000
for i in range(epochs):
    model.fit(X_train, np.reshape(Y_train,(-1,)), epochs=1, batch_size=batch_size, verbose=0, shuffle=False)
    model.reset_states()
# build state by running the model over the training data
model.reset_states()
model.predict(X_train, batch_size=batch_size)

predictions = list()
for i in range(len(X_test)):
    # make one-step forecast from the observed previous value
    X = X_test[i]
    X = X.reshape(1, 1, 1)
    yhat = model.predict(X, batch_size=batch_size)[0,0]
    # store forecast
    predictions.append(yhat)
    expected = Y_test[i, 0, 0]
    print('Month=%d, Predicted=%f, Expected=%f' % (i+1, yhat, expected))

# report performance
rmse = np.sqrt(mean_squared_error(Y_test.reshape(len(Y_test)), predictions))
print('Test RMSE: %.3f' % rmse)

# line plot of observed vs predicted
plt.plot(Y_test.reshape(len(Y_test)))
plt.plot(predictions)
plt.show()
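Now we get the picture you expected. Your original code had a few problems. First, ReLU is not a good idea for this particular problem: the problem is linear, so a "linear" or default activation should work better. Second, the RNN layer has to be created with `stateful=True` and `fit` called to match it, one epoch at a time with the state reset after each pass, as in the training loop above. Finally, I changed the prediction code so that it produces one-step forecasts.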
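To see why the "build state" step matters for a stateful layer, here is a minimal NumPy sketch of a recurrent cell whose hidden state is carried from one call to the next. The weights `W_in` and `W_rec` and the helpers `step` and `run` are hypothetical; this is not the trained Keras model, only an illustration of the mechanism.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical weights for a 1-input, 3-unit tanh RNN cell
W_in = rng.normal(size=(1, 3))
W_rec = rng.normal(size=(3, 3))

def step(x, h):
    # One recurrent step: new state from scalar input x and previous state h
    return np.tanh(x * W_in[0] + h @ W_rec)

def run(xs, h):
    # Feed a sequence one value at a time, carrying the state forward
    for x in xs:
        h = step(x, h)
    return h

history = [0, 1, 2, 3, 2, 1]
h0 = np.zeros(3)                 # state right after a reset

# Priming on the history plays the role of predicting over X_train after a reset
h_primed = run(history, h0)
out_primed = step(0.0, h_primed)

# Same input, but starting from a freshly reset state
out_fresh = step(0.0, h0)

print("with history:   ", out_primed)
print("without history:", out_fresh)
```

The same input gives different outputs depending on the carried state, which is why the forecasting code first resets the state and then replays the training data before predicting on the test set.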
This is not bad, but it is only a one-step forecast. Next, we will try the dynamic forecast described earlier.
# build state by running the model over the training data
model.reset_states()
model.predict(X_train, batch_size=batch_size)

dynpredictions = list()
dyhat = X_test[0]    # last observed value; everything after this is forecast
for i in range(len(X_test)):
    # make dynamic forecast: feed the previous forecast back in
    dyhat = np.reshape(dyhat, (1, 1, 1))
    dyhat = model.predict(dyhat, batch_size=batch_size)[0,0]
    # store forecast
    dynpredictions.append(dyhat)
    expected = Y_test[i, 0, 0]
    print('Month=%d, Predicted Dynamically=%f, Expected=%f' % (i+1, dyhat, expected))

# report performance
drmse = np.sqrt(mean_squared_error(Y_test.reshape(len(Y_test)), dynpredictions))
print('Test Dynamic RMSE: %.3f' % drmse)

# line plot of observed vs predicted
plt.plot(Y_test.reshape(len(Y_test)))
plt.plot(dynpredictions)
plt.show()
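As the plot shows, the dynamic forecast does not look so good. Recall that we are now out of sample and, unlike in the one-step forecast, we use no observations beyond #96. Still, we would like to crack this problem: the pattern is so obvious to us that we expect the NN to pick it up as well.
I will try a different NN with only one hidden layer, and with regularization to fight overfitting, as shown below.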
seed(1)

# Smaller stateful RNN with L2 weight penalties and a single output layer
modelR = Sequential()
batch_size = 1
modelR.add(SimpleRNN(4, batch_input_shape=(batch_size, X_train.shape[1], X_train.shape[2]), stateful=True,
                     kernel_regularizer=regularizers.l2(0.01),
                     activity_regularizer=regularizers.l1(0.)))
modelR.add(Dense(1, kernel_regularizer=regularizers.l2(0.01),
                 activity_regularizer=regularizers.l1(0.)))
modelR.compile(loss='mean_squared_error', optimizer='adam')

# Train one epoch at a time, in order, resetting the RNN state after each pass
epochs = 1000
for i in range(epochs):
    modelR.fit(X_train, np.reshape(Y_train,(-1,)), epochs=1, batch_size=batch_size, verbose=0, shuffle=False)
    modelR.reset_states()
# build state by running the model over the training data
modelR.reset_states()
modelR.predict(X_train, batch_size=batch_size)

predictions = list()
for i in range(len(X_test)):
    # make one-step forecast from the observed previous value
    X = X_test[i]
    X = X.reshape(1, 1, 1)
    yhat = modelR.predict(X, batch_size=batch_size)[0,0]
    # store forecast
    predictions.append(yhat)
    expected = Y_test[i, 0, 0]
    print('Month=%d, Predicted=%f, Expected=%f' % (i+1, yhat, expected))

# report performance
rmse = np.sqrt(mean_squared_error(Y_test.reshape(len(Y_test)), predictions))
print('Test RMSE: %.3f' % rmse)

# line plot of observed vs predicted
plt.plot(Y_test.reshape(len(Y_test)))
plt.plot(predictions)
plt.show()
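As a side note on what the L2 kernel penalty in `modelR` does: it adds the regularization factor times the sum of squared weights to the loss, so large weights cost more even when the fit is identical. The sketch below reproduces that arithmetic in plain NumPy; the helper names and example weights are made up for illustration.

```python
import numpy as np

def l2_penalty(weights, lam=0.01):
    # lam * sum of squared weights, the term an L2 kernel regularizer adds
    return lam * np.sum(np.square(weights))

def penalized_mse(y_true, y_pred, weights, lam=0.01):
    # Mean squared error plus the L2 weight penalty
    mse = np.mean((np.asarray(y_true) - np.asarray(y_pred)) ** 2)
    return mse + l2_penalty(weights, lam)

y_true = [1.0, 2.0]
y_pred = [1.0, 2.0]              # a perfect fit, so the MSE part is zero

w_small = np.array([0.5, -0.5])  # hypothetical small weights
w_large = np.array([5.0, -5.0])  # hypothetical large weights

loss_small = penalized_mse(y_true, y_pred, w_small)
loss_large = penalized_mse(y_true, y_pred, w_large)
print(loss_small, loss_large)
```

Both weight vectors fit the data equally well here, but the optimizer is pushed toward the smaller one, which is the mechanism used above to counter overfitting.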
The new model still handles the one-step forecast, as the plot shows.
Now let's try the dynamic forecast.
# build state by running the model over the training data
modelR.reset_states()
modelR.predict(X_train, batch_size=batch_size)

dynpredictions = list()
dyhat = X_test[0]    # last observed value; everything after this is forecast
for i in range(len(X_test)):
    # make dynamic forecast: feed the previous forecast back in
    dyhat = np.reshape(dyhat, (1, 1, 1))
    dyhat = modelR.predict(dyhat, batch_size=batch_size)[0,0]
    # store forecast
    dynpredictions.append(dyhat)
    expected = Y_test[i, 0, 0]
    print('Month=%d, Predicted Dynamically=%f, Expected=%f' % (i+1, dyhat, expected))

# report performance
drmse = np.sqrt(mean_squared_error(Y_test.reshape(len(Y_test)), dynpredictions))
print('Test Dynamic RMSE: %.3f' % drmse)

# line plot of observed vs predicted
plt.plot(Y_test.reshape(len(Y_test)))
plt.plot(dynpredictions)
plt.show()
Now the dynamic forecast seems to work too!