Neural network unable to track training data

data-mining machine-learning python neural-network tensorflow regression
2022-02-26 05:22:55

I am new to ML and this is my first TensorFlow project. I am using a neural network to perform regression on a dataset with 17 features and 1 outcome. For some reason, my network cannot track the training data; as you can see from the plots below, my results are wildly off. I have also tried different parameters (learning rate, nodes per layer, number of layers, etc.), but nothing seems to work. I have pasted the TensorFlow code here, along with my cost and training plots and a link to my dataset. I would appreciate it if someone could help me figure out what I am doing wrong. Thanks!

Links to the dataset - features arranged by column: https://drive.google.com/file/d/1U182Lhf67WygeSbv6BNEx5LHyL7Ba13O/view?usp=sharing - output column: https://drive.google.com/file/d/10XWo1d5mhIsxccQBgAyGDWDAVgu2BjAA/view?usp=sharing

[Image: cost plot] [Image: training plot]

import numpy as np
import matplotlib.pyplot as plt
import tensorflow as tf

# importing features and observations data for training and validation
training_filename_X = "training_set_X.csv"
training_filename_Y = "training_set_Y.csv"
validation_filename_X = "validation_set_X.csv"
test_filename_X = "test_set_X.csv"
test_filename_Y = "test_set_Y.csv"
validation_filename_Y = "validation_set_Y.csv"
training_features = np.loadtxt(training_filename_X, delimiter=',')
training_observations = np.loadtxt(training_filename_Y, delimiter=',')
validation_features = np.loadtxt(validation_filename_X, delimiter=',')
validation_observations = np.loadtxt(validation_filename_Y, delimiter=',')
test_features = np.loadtxt(test_filename_X, delimiter=',')
test_observations = np.loadtxt(test_filename_Y, delimiter=',')

# normalizing training data
training_features_stddev_arr = np.std(training_features, axis=0)
training_features_mean_arr = np.mean(training_features, axis=0)
normalized_training_features = (training_features-training_features_mean_arr)/training_features_stddev_arr

# layer parameters
n_nodes_hl1 = 5
n_nodes_hl2 = 5
n_nodes_hl3 = 3
no_features = 17
learning_rate = 0.001
epochs = 2000

cost_history = []

X = tf.placeholder(tf.float32)
Y = tf.placeholder(tf.float32)

# defining weights for each layer taken from a normal distribution with variance 2/n
hl1_weight = tf.Variable(tf.random_normal([no_features, n_nodes_hl1], stddev=np.sqrt(2/no_features)))
hl2_weight = tf.Variable(tf.random_normal([n_nodes_hl1, n_nodes_hl2], stddev=np.sqrt(2/n_nodes_hl1)))
hl3_weight = tf.Variable(tf.random_normal([n_nodes_hl2, n_nodes_hl3], stddev=np.sqrt(2/n_nodes_hl2)))
output_weight = tf.Variable(tf.random_normal([n_nodes_hl3, 1], stddev=np.sqrt(2/n_nodes_hl3)))

# defining biases for each layer
hl1_bias = tf.Variable(tf.random_uniform([n_nodes_hl1], -1.0, 1.0))
hl2_bias = tf.Variable(tf.random_uniform([n_nodes_hl2], -1.0, 1.0))
hl3_bias = tf.Variable(tf.random_uniform([n_nodes_hl3], -1.0, 1.0))
output_bias = tf.Variable(tf.random_uniform([1], -1.0, 1.0))

# defining activation functions for each layer
hl1 = tf.sigmoid(tf.matmul(X, hl1_weight) + hl1_bias)
hl2 = tf.sigmoid(tf.matmul(hl1, hl2_weight) + hl2_bias)
hl3 = tf.sigmoid(tf.matmul(hl2, hl3_weight) + hl3_bias)
output = tf.matmul(hl3, output_weight) + output_bias

# using mean squared error cost function
cost = tf.reduce_mean(tf.square(output - Y))

# using Gradient Descent algorithm
optimizer = tf.train.GradientDescentOptimizer(learning_rate).minimize(cost)


init = tf.global_variables_initializer()

# running the network
with tf.Session() as sess:
    sess.run(init)

    for step in range(epochs):
        sess.run(optimizer, feed_dict={X:normalized_training_features, Y:training_observations})
#        print (sess.run(cost, feed_dict={X:normalized_training_features, Y:training_observations}))
        cost_history.append(sess.run(cost, feed_dict={X:normalized_training_features, Y:training_observations}))

    pred_y = sess.run(output, feed_dict={X:normalized_training_features})

    plt.plot(range(len(cost_history)), cost_history)
1 Answer

OK, after asking my friends, I have found at least one reason why this happens.

We cannot use sigmoid as the last layer, because it will always output a value between 0 and 1; that is why your predictions do not change at all. In other words, sigmoid should not be used as the final layer at all unless the task is classification, say of images, and so on...
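
To make that concrete, here is a minimal sketch in the same TF1-style API as the question (the input size and variable names are illustrative, not taken from the original code): a regression head stays linear so it can produce any real value, while a sigmoid head only makes sense as a probability for classification.

import tensorflow as tf

n_inputs = 3  # illustrative size, not from the original code

X = tf.placeholder(tf.float32, [None, n_inputs])

# one output unit shared by both heads
w = tf.Variable(tf.random_normal([n_inputs, 1]))
b = tf.Variable(tf.zeros([1]))
logits = tf.matmul(X, w) + b

# regression head: keep the output linear so it can take any real value
# (e.g. targets in the hundreds, or negative targets)
regression_output = logits

# classification head: sigmoid squashes the output into (0, 1),
# which is only meaningful as a probability
classification_probability = tf.sigmoid(logits)

With the linear head, the mean-squared-error cost from the question can be driven toward targets of any scale; with the sigmoid head, the prediction can never leave (0, 1) no matter how long you train.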