Data mining - Neural network data normalization setup

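I am fairly new to neural networks and have some questions about data normalization.

I am trying to build a regression neural network with two neurons in the output layer, using the "mean_squared_error" loss function. The problem I am running into is that one of the outputs has a small scale (values between 0.9 and 1.4), while the other has a much larger scale (values between 60 and 80).

My understanding is that with multiple output neurons the MSE is computed with the following formula: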

MSE = (|op1 - targ1|^2 + |op2 - targ2|^2 ) / 2

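Given that the scales differ by more than an order of magnitude, I believe I have to normalize the target columns of the training dataset before fitting the neural network, so that both targets are weighted equally and their scale does not affect the result.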
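To make the scale concern concrete, here is a minimal sketch of the formula above applied to one sample. The target values and the 5% relative error are made-up numbers chosen only to fall in the ranges described, and `mse_two_outputs` is a hypothetical helper, not part of Keras.

```python
import numpy as np

def mse_two_outputs(pred, target):
    """Mean of the squared errors over the two output neurons of one sample."""
    pred = np.asarray(pred, dtype=float)
    target = np.asarray(target, dtype=float)
    return float(np.mean((pred - target) ** 2))

# Hypothetical sample: large-scale target first, small-scale target second
target = np.array([70.0, 1.2])
# Both outputs are off by the same relative amount (5%)
pred = target * 1.05

# Squared error per output: roughly [12.25, 0.0036]
per_output = (pred - target) ** 2
# The large-scale output contributes almost all of the loss
loss = mse_two_outputs(pred, target)
print(per_output, loss)
```

With unscaled targets, the same relative error on the larger output dominates the loss by several orders of magnitude, which is the motivation for normalizing the target columns.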

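I have also read that ANNs perform better when the input data is normalized, so I went ahead and normalized the whole dataset (inputs and targets) by subtracting the mean and dividing by the standard deviation (z-score normalization).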
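The z-score step described above could look roughly like the following sketch. The helper names `fit_zscore` and `apply_zscore` are assumptions for illustration (the post does not show its normalization code), and the sample rows are the first four "actual" target rows listed later in the post.

```python
import numpy as np

def fit_zscore(data):
    """Return the per-column mean and standard deviation of a 2-D array."""
    data = np.asarray(data, dtype=float)
    return data.mean(axis=0), data.std(axis=0)

def apply_zscore(data, mean, std):
    """Subtract the column means and divide by the column standard deviations."""
    return (np.asarray(data, dtype=float) - mean) / std

# A few target rows (target_1, target_2) taken from the post's "actual" table
Y = np.array([[86.270, 1.243],
              [81.296, 1.330],
              [75.202, 1.309],
              [73.839, 1.342]])

# Statistics are computed once on the training data and kept for later reuse
y_mean, y_std = fit_zscore(Y)
Y_norm = apply_zscore(Y, y_mean, y_std)
print(Y_norm.mean(axis=0), Y_norm.std(axis=0))
```

Keeping `y_mean` and `y_std` around matters because the same statistics are needed later to map predictions back to the original units. A column with zero variance would need special handling, which this sketch leaves out.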

Here is the code I use to build the sequential model after normalizing the inputs:

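# Imports assumed (not shown in the original post)
from keras.models import Sequential
from keras.layers import Dense, BatchNormalization
from keras.regularizers import l2
from keras.optimizers import Adam

# coeff_determination is a custom metric defined elsewhere (not shown here).

def build_model(n_layers, input_dim, units, activation):  # , initializer):
    # Accept either one width per layer or a single width shared by all layers
    if isinstance(units, list):
        assert len(units) == n_layers
    else:
        units = [units] * n_layers

    model = Sequential()

    # Adds first hidden layer with input_dim parameter
    model.add(Dense(units=units[0],
                    input_dim=input_dim,
                    activation=activation,
                    kernel_regularizer=l2(0),
                    # kernel_initializer=initializer,
                    name='h1'))
    model.add(BatchNormalization())

    # Adds remaining hidden layers
    for i in range(2, n_layers + 1):
        model.add(Dense(units=units[i - 1],
                        activation=activation,
                        kernel_regularizer=l2(0),
                        # kernel_initializer=initializer,
                        name='h{}'.format(i)))
        model.add(BatchNormalization())

    # Adds output layer (two linear neurons, one per target)
    model.add(Dense(units=2, name='o'))  # , kernel_initializer=initializer))

    # Compiles the model
    # Argument names follow the older Keras API used in this post
    # (newer releases rename lr to learning_rate).
    optimizer = Adam(lr=0.01, beta_1=0.9, beta_2=0.999, epsilon=1e-8,
                     decay=0.05, amsgrad=False)

    model.compile(loss='mean_squared_error', optimizer=optimizer,
                  metrics=[coeff_determination])
    return model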

Here is how I build and compile the model (10 layers with 50 hidden units each):

model = build_model(n_layers=10, input_dim=len(model_inputs), units=50, activation='relu')

Finally, I fit my model using:

history = model.fit(x=X_norm, y=Y_norm, epochs=225, batch_size=X_norm.shape[0])

and got the following output for the last epoch:

Epoch 225/225 162/162 [========================] - 0s 86us/step - loss: 1.2539e-04

I then ran predictions on the training dataset as a sanity check, using:

Y_new = model.predict(X_new)

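Finally, I "de-normalized" the predictions using the mean and standard deviation that were originally used to normalize the target columns.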
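The de-normalization described above is the inverse of the z-score transform. A minimal sketch follows, where `denormalize`, `y_mean`, and `y_std` are hypothetical names and the numeric values are made up for illustration (the post does not give the actual statistics).

```python
import numpy as np

def denormalize(y_norm, mean, std):
    """Invert z-score normalization: multiply by std, then add the mean."""
    return np.asarray(y_norm, dtype=float) * std + mean

# Hypothetical training-set statistics for (target_1, target_2)
y_mean = np.array([75.0, 1.2])
y_std = np.array([6.0, 0.2])

# A made-up normalized prediction: +0.5 std on target_1, -1 std on target_2
y_pred_norm = np.array([[0.5, -1.0]])
y_pred = denormalize(y_pred_norm, y_mean, y_std)
print(y_pred)  # [[78.  1.]]
```

The statistics passed in have to be the ones computed on the training targets, and their column order has to match the order of the model's two output neurons.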

Here are the actual vs. predicted values for some of the observations:

Predicted:

    target_1    target_2
0   82.485733   1.370714
1   80.422562   1.260928
4   81.192978   1.216307
5   67.609528   1.191042
6   71.330009   1.012126
7   75.563698   1.161521
8   80.212601   1.341668
9   76.349266   0.962544
10  81.197662   1.307808
12  79.872147   1.306849
13  82.837700   1.313502
14  73.760750   1.687881
16  75.290382   1.368603
18  70.717682   1.229838
19  76.409767   1.307446
20  85.307167   1.342816
21  70.818542   1.242142
22  78.467438   1.382735
23  75.320892   1.238882
25  89.502998   1.446245

Actual:

    target_1  target_2
0   86.2700   1.243
1   81.2960   1.330
4   75.2020   1.309
5   73.8390   1.342
6   73.1020   1.296
7   79.8180   0.795
8   77.3300   1.612
9   78.4010   1.074
10  65.6410   1.457
12  83.9160   1.449
13  69.1829   1.166
14  88.9450   1.056
16  77.7440   1.392
18  71.8940   1.169
19  86.0040   1.306
20  79.8420   1.215
21  74.3650   0.977
22  66.3840   1.464
23  84.3200   1.295
25  75.8870   1.808

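Questions:

I feel that with such a low cost value the predicted and actual values should be much closer, especially on the training dataset, but maybe I am thinking about this the wrong way.
Is there any downside to normalizing both the inputs and the target columns?
Am I performing the normalization correctly?
Am I correctly transforming the model's outputs back, using the same mean and standard deviation that were used to normalize them?
Is my setup valid for a two-neuron regression NN?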
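One way to probe the first point above is to recompute the MSE in standardized units from the de-normalized predictions and compare it with the loss reported by training. The sketch below does this for the first five rows of the two tables. The helper `standardized_mse` is hypothetical, and because the post does not give the training-set mean and standard deviation, they are estimated here from these five actual rows only, so the result is a rough indication rather than an exact reproduction of the loss.

```python
import numpy as np

def standardized_mse(actual, predicted, mean, std):
    """MSE between actual and predicted values after z-scoring both."""
    a = (np.asarray(actual, dtype=float) - mean) / std
    p = (np.asarray(predicted, dtype=float) - mean) / std
    return float(np.mean((a - p) ** 2))

# First five rows (target_1, target_2) from the "actual" table
actual = np.array([[86.270, 1.243],
                   [81.296, 1.330],
                   [75.202, 1.309],
                   [73.839, 1.342],
                   [73.102, 1.296]])

# The same five rows from the "predicted" table
predicted = np.array([[82.485733, 1.370714],
                      [80.422562, 1.260928],
                      [81.192978, 1.216307],
                      [67.609528, 1.191042],
                      [71.330009, 1.012126]])

# Assumption: statistics estimated from these rows, not the full training set
mean, std = actual.mean(axis=0), actual.std(axis=0)

reported_loss = 1.2539e-04
recomputed = standardized_mse(actual, predicted, mean, std)
print(recomputed, reported_loss)
```

For these five rows the recomputed value comes out well above the reported 1.2539e-04. If the same holds with the real training statistics, that would suggest checking whether the inputs passed to `predict` and the statistics used for the inverse transform match those used during training.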