TensorFlow MLP loss increasing

Tags: data-mining, neural-network, tensorflow

2022-03-10 17:23:14

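When I train my model, the loss increases with every epoch. I suspect the fix is simple and that I am missing something obvious, but I cannot work out what it is. Any help would be greatly appreciated.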

The neural network:

def neural_network(data):
    hidden_L1 = {'weights': tf.Variable(tf.random_normal([784, neurons_L1])),
                'biases': tf.Variable(tf.random_normal([neurons_L1]))}

    hidden_L2 = {'weights': tf.Variable(tf.random_normal([neurons_L1, neurons_L2])),
                'biases': tf.Variable(tf.random_normal([neurons_L2]))}

    output_L = {'weights': tf.Variable(tf.random_normal([neurons_L2, num_of_classes])),
                'biases': tf.Variable(tf.random_normal([num_of_classes]))}

    L1 = tf.add(tf.matmul(data, hidden_L1['weights']), hidden_L1['biases']) #matrix multiplication
    L1 = tf.nn.relu(L1)

    L2 = tf.add(tf.matmul(L1, hidden_L2['weights']), hidden_L2['biases']) #matrix multiplication
    L2 = tf.nn.relu(L2)

    output = tf.add(tf.matmul(L2, output_L['weights']), output_L['biases']) #matrix multiplication
    output = tf.nn.softmax(output)

    return output

My loss, optimizer, and per-epoch loop:

output = neural_network(x)
loss = tf.reduce_mean( tf.nn.softmax_cross_entropy_with_logits(logits=output, labels=y) )
optimiser = tf.train.AdamOptimizer().minimize(loss)

init = tf.global_variables_initializer()

epochs = 5
total_batch_count = 60000//batch_size

with tf.Session() as sess:
    sess.run(init)

    for epoch in range(epochs):

        avg_loss = 0 

        for i in range(total_batch_count):

            batch_x, batch_y = next_batch(batch_size, x_train, y_train)

            _, c = sess.run([optimiser, loss], feed_dict = {x:batch_x, y:batch_y})

            avg_loss +=c/total_batch_count

            print("epoch = ", epoch + 1, "loss =", avg_loss)

    sess.close()

I have a feeling the problem lies in my loss function or in the loop I wrote over the epochs, but I am new to TensorFlow and cannot figure it out.

1 Answer

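You are using the function softmax_cross_entropy_with_logits, which, according to the TensorFlow documentation, specifies its logits argument as follows:

logits: Per-label activations, typically a linear output. These activation energies are interpreted as unnormalized log probabilities.

You should therefore pass the activations from before the nonlinearity (softmax, in your case) is applied. The function applies softmax internally, so feeding it values that have already gone through softmax applies softmax twice. You can fix this as follows: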

def neural_network(data):
    hidden_L1 = {'weights': tf.Variable(tf.random_normal([784, neurons_L1])),
                'biases': tf.Variable(tf.random_normal([neurons_L1]))}

    hidden_L2 = {'weights': tf.Variable(tf.random_normal([neurons_L1, neurons_L2])),
                'biases': tf.Variable(tf.random_normal([neurons_L2]))}

    output_L = {'weights': tf.Variable(tf.random_normal([neurons_L2, num_of_classes])),
                'biases': tf.Variable(tf.random_normal([num_of_classes]))}

    L1 = tf.add(tf.matmul(data, hidden_L1['weights']), hidden_L1['biases']) #matrix multiplication
    L1 = tf.nn.relu(L1)

    L2 = tf.add(tf.matmul(L1, hidden_L2['weights']), hidden_L2['biases']) #matrix multiplication
    L2 = tf.nn.relu(L2)

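    # Raw, unnormalized class scores: this is what the loss function expects
    logits = tf.add(tf.matmul(L2, output_L['weights']), output_L['biases'])
    # Class probabilities, for prediction and accuracy only (not for the loss)
    output = tf.nn.softmax(logits)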

    return output, logits

Then, outside the function, retrieve the logits and pass them to your loss function, as in the following example:

output, logits = neural_network(x)
loss = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=logits,
                                                             labels=y))
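To illustrate why the distinction matters, here is a minimal pure-Python sketch (no TensorFlow) comparing the loss computed from raw logits with the loss computed from already-softmaxed values. The helper names and the example numbers are hypothetical, chosen only for illustration. Because softmax outputs lie in [0, 1], a second softmax can never assign the true class a probability above e / (e + K - 1) for K classes, so the loss cannot fall below log(1 + (K - 1) / e), however confident the network is.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy_with_logits(logits, labels):
    """Cross-entropy between one-hot `labels` and softmax(logits)."""
    probs = softmax(logits)
    return -sum(t * math.log(p) for t, p in zip(labels, probs))

# Hypothetical example: 3 classes, the network is confident and correct.
labels = [0.0, 0.0, 1.0]
logits = [-5.0, -5.0, 10.0]

# Correct usage: raw logits go into the loss.
correct_loss = cross_entropy_with_logits(logits, labels)

# The mistake from the question: softmax output is passed as "logits",
# so softmax is effectively applied twice.
double_loss = cross_entropy_with_logits(softmax(logits), labels)

# Lower bound on the loss under double softmax for K = 3 classes.
num_classes = len(labels)
floor = math.log(1 + (num_classes - 1) / math.e)

print("loss from logits:         ", correct_loss)
print("loss from softmax output: ", double_loss)
print("floor with double softmax:", floor)
```

With ten classes, as in the question, the same bound is log(1 + 9 / e), roughly 1.46. This sketch only shows that the double softmax distorts the loss; it does not by itself establish every cause of the behaviour seen in the question.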

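Note that the function still returns the output tensor, since you will probably want it for computing the accuracy of your network. If this change does not help, you should also try adjusting the learning_rate parameter of AdamOptimizer (see the AdamOptimizer documentation).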
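As a rough illustration of the accuracy computation mentioned above, the following pure-Python sketch counts how often the highest-scoring class matches the one-hot label. The helper names and the small batch are hypothetical. Softmax preserves the ordering of the scores, so the predicted class is the same whether it is taken from the logits or from the softmax output.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def argmax(values):
    """Index of the largest element."""
    return max(range(len(values)), key=lambda i: values[i])

def accuracy(batch_scores, batch_labels):
    """Fraction of rows whose top-scoring class matches the one-hot label."""
    hits = sum(argmax(s) == argmax(t)
               for s, t in zip(batch_scores, batch_labels))
    return hits / len(batch_scores)

# Hypothetical batch of 4 examples with 3 classes each.
batch_logits = [[2.0, -1.0, 0.5],
                [0.1, 0.2, 3.0],
                [1.5, 1.4, -2.0],
                [0.0, 4.0, 1.0]]
batch_labels = [[1, 0, 0],
                [0, 0, 1],
                [0, 1, 0],
                [0, 1, 0]]

acc_from_logits = accuracy(batch_logits, batch_labels)
acc_from_probs = accuracy([softmax(row) for row in batch_logits], batch_labels)

print("accuracy from logits:       ", acc_from_logits)
print("accuracy from probabilities:", acc_from_probs)
```

In the TensorFlow graph, the same idea would be expressed with the framework's own argmax and mean operations on the output tensor; the sketch above only shows the arithmetic.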
