Data mining - Measuring uncertainty in an LSTM network using dropout in Keras/TensorFlow

I created a simple LSTM network for testing:

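import tensorflow as tf
from tensorflow.keras import layers

# tf.train.AdamOptimizer is TensorFlow 1.x API; timesteps and data_dim are defined elsewhere.
model = tf.keras.Sequential()
model.add(layers.LSTM(32, input_shape=(timesteps, data_dim), recurrent_dropout=0.2))
model.add(layers.Dense(1))
model.compile(loss='mae', metrics=['accuracy'], optimizer=tf.train.AdamOptimizer())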

I am using the dropout method studied by Yarin Gal and Lingxue Zhu. My dropout function looks like this:

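from tensorflow.keras import backend as K
# The second input is the learning-phase flag: passing 1 keeps dropout active.
f = K.function([model.layers[0].input, K.learning_phase()],
               [model.layers[-1].output])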
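To make the role of the learning-phase flag concrete, here is a minimal sketch of generic inverted dropout in plain Python. It is not the Keras implementation of `recurrent_dropout`; the names `dropout`, `hidden`, and `rng` are made up for illustration. The point is only that with the flag set to 0 the values pass through unchanged, while with the flag set to 1 a fresh random mask is drawn on every call.

```python
import random

def dropout(values, rate, training, rng):
    """Generic inverted dropout (illustrative, not Keras internals)."""
    if not training:
        return list(values)  # inference mode: values pass through unchanged
    keep = 1.0 - rate
    # Keep each value with probability `keep` and rescale it by 1/keep;
    # otherwise replace it with zero.
    return [v / keep if rng.random() < keep else 0.0 for v in values]

rng = random.Random(0)
hidden = [1.0, 2.0, 3.0, 4.0]  # hypothetical activations
print(dropout(hidden, 0.5, training=0, rng=rng))  # same as `hidden`
print(dropout(hidden, 0.5, training=1, rng=rng))  # each entry is 0.0 or doubled
```

Calling `f([x, 1])` relies on the same switch: the flag tells the backend to treat the pass as a training-mode pass, so dropout stays on.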

After training, I use the function above in "predict_with_dropout":

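import numpy as np

def predict_with_dropout(x, f=f, n_iter=100):
    """Run n_iter stochastic forward passes on x with dropout enabled."""
    result = np.zeros((n_iter,))
    for i in range(n_iter):
        # f returns a list holding one (1, 1) array; take its scalar value.
        result[i] = f([x, 1])[0].item()
    return result

results = []
for point in test_X:
    # Wrap the single point in a list so it forms a batch of size 1.
    results.append(predict_with_dropout([point]))

# results has shape (n_points, n_iter); aggregate over the n_iter samples.
results_avg = np.mean(results, axis=1)
variance = np.var(results, axis=1)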

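This code works as expected. As I understand it, the "predict_with_dropout" function uses f to retrain the LSTM model 100 times, and in each of those 100 runs it drops some of the model's units.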
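One point may be worth checking against that understanding: a function built with `K.function` from inputs to outputs only evaluates a forward pass, so no weights are updated when it is called. The sketch below is a toy stand-in, not the LSTM. `weights`, `stochastic_forward`, and the input `x` are hypothetical. It shows 100 stochastic passes through fixed weights, followed by the same mean and variance summary.

```python
import random
import statistics

weights = [0.5, -1.0, 2.0]  # hypothetical "trained" weights, never modified below

def stochastic_forward(x, rate, rng):
    """One forward pass of a toy linear model with dropout on its inputs."""
    keep = 1.0 - rate
    total = 0.0
    for w, xi in zip(weights, x):
        if rng.random() < keep:       # this unit survives the random mask
            total += w * xi / keep    # inverted-dropout rescaling
    return total

rng = random.Random(0)
x = [1.0, 2.0, 3.0]
before = list(weights)

# 100 forward passes: only the dropout mask changes between them.
samples = [stochastic_forward(x, 0.2, rng) for _ in range(100)]

print(statistics.mean(samples))       # predictive mean
print(statistics.pvariance(samples))  # population variance, like np.var
print(weights == before)              # the weights are untouched
```

The spread of `samples` comes entirely from the resampled masks, which is the quantity the dropout-based uncertainty estimate reads off.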

Is this a correct implementation of the papers, or am I missing something? If it is correct, is there a way to speed it up?
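On the speed question, one possible direction is to cut the number of separate calls by batching. The simplest variant passes all of test_X in a single call per iteration. A further step, sketched below, repeats each test point n_iter times so that one call produces every sample. This assumes the framework draws an independent dropout mask for each row of the batch, which should be verified for the layer in use. The toy model here is a NumPy stand-in with 2-D inputs, and `W`, `forward_with_dropout`, and `mc_predict_batched` are hypothetical names; for the LSTM the repeat would apply along the batch axis of a 3-D input.

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(8, 1))  # hypothetical fixed weights of a toy linear model

def forward_with_dropout(batch, rate=0.2):
    """One stochastic pass; every row draws its own dropout mask."""
    keep = 1.0 - rate
    mask = rng.random(batch.shape) < keep
    return ((batch * mask / keep) @ W)[:, 0]

def mc_predict_batched(X, n_iter=100):
    # Repeat each test point n_iter times: shape (n_points * n_iter, features).
    tiled = np.repeat(X, n_iter, axis=0)
    out = forward_with_dropout(tiled)
    # Row i of the result holds the n_iter samples for test point i.
    return out.reshape(len(X), n_iter)

X = rng.normal(size=(5, 8))           # five hypothetical test points
samples = mc_predict_batched(X)
results_avg = samples.mean(axis=1)
variance = samples.var(axis=1)
print(samples.shape, results_avg.shape, variance.shape)
```

Memory grows with n_points * n_iter, so for large test sets the repeated batch may need to be processed in chunks.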