使用神经网络进行回归

人工智能神经网络深度学习反向传播回归

2021-10-31 19:57:38

我想就以下问题寻求任何帮助：

我得到了以下训练数据：100 个数字，每个数字都是一个参数，它们一起定义了一个数字 X（也给定了）。这是一个实例，我有 20 000 个训练实例。接下来，我给出了 5000 行，每个包含 100 个数字作为参数。我的任务是预测这 5000 个实例的数字 X。

我被卡住了，因为到目前为止我只知道 sigmoid 激活函数，并且我认为它不适合输出值不是 0 或 1 的情况。

所以我的问题是：什么是激活函数的好选择？如何为这样的问题实现神经网络？

4个回答

让我们模拟一些数据。

“100个数字，每一个都是一个参数，它们共同定义了一个数字X（也给定了）”

# i.e. size of X_train -> [n x d]
# i.e. size of X_train -> [??? x 100]  , when d = 100

# "I have 20000 instances for training"
# i.e. size of X_train -> [20000 x 100], when n = 20000

import torch
import numpy as np

X_train = torch.rand((20000, 100))
X_train = np.random.rand(20000, 100) # Or using numpy

但是你的Y是什么？

# Since the definition of a regression task, 
# loosely means to predict an output real number
# given an input of d dimension

# So the appropriate Y_train would 
# be of dimension [n x 1] 
# and look like this:

y_train = torch.rand((20000, 1))

y_train = np.random.rand(20000, 1) # Or using numpy

什么是线性感知器？

从本教程中获取定义

因此，在图片中：

接下来我们需要定义训练例程，

现在把它当作圣经真理，这是训练神经网络模型的一个好的例程（这不是唯一的方法，而是最简单的或监督学习）：

在代码中：

import math
import numpy as np
np.random.seed(0)

def sigmoid(x): # Returns values that sums to one.
    return 1 / (1 + np.exp(-x))

def sigmoid_derivative(sx): 
    # See https://math.stackexchange.com/a/1225116
    # Hint: let sx = sigmoid(x)
    return sx * (1 - sx)

def cost(predicted, truth):
    return np.abs(truth - predicted)


num_epochs = 10000 # No. of times to iterate.
learning_rate = 0.03 # How large a step to take per iteration.

# Lets standardize and call our inputs X and outputs Y
X = np.array(torch.rand((20000, 100)))
Y = or_output

for _ in range(num_epochs):
    layer0 = X

    # Step 2a: Multiply the weights vector with the inputs, sum the products, i.e. s
    # Step 2b: Put the sum through the sigmoid, i.e. f()
    # Inside the perceptron, Step 2. 
    layer1 = sigmoid(np.dot(X, W))

    # Back propagation.
    # Step 3a: Compute the errors, i.e. difference between expected output and predictions
    # How much did we miss?
    layer1_error = cost(layer1, Y)

    # Step 3b: Multiply the error with the derivatives to get the delta
    # multiply how much we missed by the slope of the sigmoid at the values in layer1
    layer1_delta = layer1_error * sigmoid_derivative(layer1)

    # Step 3c: Multiply the delta vector with the inputs, sum the product (use np.dot)
    # Step 4: Multiply the learning rate with the output of Step 3c.
    W +=  learning_rate * np.dot(layer0.T, layer1_delta)

现在我们学习了模型，即W.

当我们看到需要使用模型的数据点时，我们应用相同的前向传播步骤，即layer1 = sigmoid(np.dot(X, W))

因为我们有：

我给出了 5000 行，每行包含 100 个数字作为参数。我的任务是预测这 5000 个实例的数字 X。

在代码中：

# If we mock up the data,
# it should be the same internal dimension. 
X_test = np.random.rand(5000, 100)

# The desired output just needs to pass through the W and the activation:
# the shape of `output` -> [5000 x 1] , 
# where there's 1 output value for each input.
output = sigmoid(np.dot(X_test, W))

快速的答案是你想在输出层上使用一个激活函数，它不会将值压缩到 $(0,1)$ . 根据您的软件，这可能被称为“线性”或“身份”。看起来 Keras 只是想让你放弃激活功能：model.add(Dense(1)).

将神经网络视为分类器（假设是二元分类器）的典型方式只是扩展逻辑回归。实际上，当您在输出节点上使用 sigmoid 激活函数时，您（某种程度上）在最终隐藏层上运行逻辑回归。

逻辑回归是一种广义线性模型。GLM 的要点是感兴趣值的一些变换是特征空间的线性函数。

让 $X$ 是特征空间的数据矩阵。让 $\beta$ 是一个参数向量。然后 $\hat{y} = \mathbb{E}[y] = X\beta$ 是线性模型，并且 $g(\mathbb{E}[y]) = X\beta$ 是广义线性模型（向量化，所以应用 $g$ 每个 $y_i$ ）。

但是我们可以将其扩展到非线性变换，当神经网络是二元分类器时，这正是我们正在做的事情。而不是改造 $X$ 被给予 $\beta$ 因此是线性的，我们应用一些非线性变换 $f$ 并得到 $g(\mathbb{E}[y]) = f(X)$ .

GLM 中的术语是“链接函数”，但这本质上是神经网络最终节点上的激活函数。因此，所有 GLM 链接函数都在起作用，其中一个链接函数是恒等函数。对于 GLM，这只是线性回归。对于您的神经网络，它将是一个神经网络（非线性）回归，这听起来像您想要的。

通常你首先对数据进行归一化，这意味着你的整个数据集将在 0 和 1 之间。在你进行模型预测之后，在计算成本函数或评估模型时，你可以应用归一化的逆功能。

对于回归，您可以使用带有 sigmoid 的隐藏层，然后使用 LINEAR 输出层，加权和直接通过，无需修改。

这样你的输出不限于0-1

其它你可能感兴趣的问题

上一篇如何为性别分类绘制边界框？下一篇为什么当前模型使用多个归一化层？