Loss is output as nan in Keras RNN

data-mining keras rnn loss-function
2021-09-24 19:27:23

From the very first epoch of the RNN, the loss is reported as nan.

Epoch 1/100 9787/9787 [===============================] - 22s 2ms/step - loss: nan

I have already normalized the data.

    ...,
    [9.78344703e-01],
    [1.00000000e+00],
    [9.94293976e-01]]])

A sample of my X_train (float64, shape (9787, 60, 1))


array([6.59848480e-04, 6.98212803e-04, 6.90540626e-04, ...,
       1.00000000e+00, 9.94293976e-01, 9.95909540e-01])

A sample of my y_train (float64, shape (9787,))
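
For reference, the values above lie in [0, 1], which is consistent with min-max scaling. A minimal sketch of that kind of preprocessing and windowing, assuming scikit-learn and a raw series of 9847 points (an assumption; the question does not show the actual scaling code):

import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Hypothetical stand-in for the raw series (9847 points -> 9787 windows of 60)
raw = np.random.rand(9847, 1)

# Scale all values into [0, 1]
scaler = MinMaxScaler(feature_range=(0, 1))
scaled = scaler.fit_transform(raw)

# Build sliding windows: each sample is the previous 60 values,
# the target is the next value
X_train = np.array([scaled[i - 60:i, 0] for i in range(60, len(scaled))])
y_train = np.array([scaled[i, 0] for i in range(60, len(scaled))])
X_train = X_train.reshape((X_train.shape[0], 60, 1))  # (9787, 60, 1)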

My RNN:

# Imports (assuming the TensorFlow-bundled Keras)
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dropout, Dense

# Initialising the RNN
regressor = Sequential()

# Adding the first LSTM layer and some Dropout regularisation
regressor.add(LSTM(units = 50, return_sequences = True, input_shape = (X_train.shape[1], 1)))
regressor.add(Dropout(0.2))

# Adding a second LSTM layer and some Dropout regularisation
regressor.add(LSTM(units = 50, return_sequences = True))
regressor.add(Dropout(0.2))

# Adding a third LSTM layer and some Dropout regularisation
regressor.add(LSTM(units = 50, return_sequences = True))
regressor.add(Dropout(0.2))

# Adding a fourth LSTM layer and some Dropout regularisation
regressor.add(LSTM(units = 50))
regressor.add(Dropout(0.2))

# Adding the output layer
regressor.add(Dense(units = 1))

# Compiling the RNN
regressor.compile(optimizer = 'adam', loss = 'mean_squared_error')

# Fitting the RNN to the Training set
regressor.fit(X_train, y_train, epochs = 100, batch_size = 32)
2 Answers

This may be caused by exploding gradients; try gradient clipping and see whether the loss still shows nan. For example:

from keras import optimizers

optimizer = optimizers.Adam(clipvalue=0.5)
regressor.compile(optimizer=optimizer, loss='mean_squared_error')
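
Keras optimizers also accept a clipnorm argument (e.g. optimizers.Adam(clipnorm=1.0)), which rescales the whole gradient vector when its norm exceeds the threshold instead of clipping each component independently; either form can stop nan losses caused by exploding gradients.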

There may be a nan value somewhere in your dataset. I ran the code above on another dataset and it executed without a problem.
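
A quick way to verify that, as a minimal sketch assuming X_train and y_train are NumPy arrays:

import numpy as np

# Both should print False if the data is clean
print(np.isnan(X_train).any())
print(np.isnan(y_train).any())

# Infinite values would also drive the loss to nan
print(np.isinf(X_train).any())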

That said, I did not specify the input shape in the first layer; instead, I reshaped the data before initialising the RNN.

Check your dataset for any errors, but you could also consider the following modification.

# Imports (assuming the TensorFlow-bundled Keras)
import numpy as np
import tensorflow as tf
from tensorflow.keras.layers import LSTM, Dropout, Dense

# reshape input to be [samples, time steps, features]
X_train = np.reshape(X_train, (X_train.shape[0], 1, X_train.shape[1]))

# Initialising the RNN
regressor = tf.keras.Sequential()

# Adding the first LSTM layer and some Dropout regularisation
regressor.add(LSTM(units = 50, return_sequences = True))
regressor.add(Dropout(0.2))

# Adding a second LSTM layer and some Dropout regularisation
regressor.add(LSTM(units = 50, return_sequences = True))
regressor.add(Dropout(0.2))

# Adding a third LSTM layer and some Dropout regularisation
regressor.add(LSTM(units = 50, return_sequences = True))
regressor.add(Dropout(0.2))

# Adding a fourth LSTM layer and some Dropout regularisation
regressor.add(LSTM(units = 50))
regressor.add(Dropout(0.2))

# Adding the output layer
regressor.add(Dense(units = 1))

# Compiling the RNN
regressor.compile(optimizer = 'adam', loss = 'mean_squared_error')

# Fitting the RNN to the Training set
regressor.fit(X_train, y_train, epochs = 100, batch_size = 32)
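
Note that the reshape above feeds each sample as a single timestep with 60 features, i.e. shape (9787, 1, 60), rather than the original 60 timesteps of 1 feature, and because input_shape is omitted, the model infers the shape from the data when fit is first called.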