In a neural network with tensor input X, it seems that sometimes it never learns... why?

data-mining neural-network keras tensorflow
2022-02-14 01:53:21
import numpy as np
import keras.models as km
import keras.layers as kl
import keras.optimizers as ko
import keras.losses as kloss

# This will cause no learning
np.random.seed(1692585618)

def f(x):
    a = x[0] * 3.141 + x[1]
    return a

# Create a sample dataset 
# Input is (*, 2)
# Output is (*, )
x_train=np.array([[1,2], [3,4], [5,6]])
y_train=np.array([f(x) for x in x_train])

# These are required by the shape of x_train and y_train
in_dim = x_train.shape[1]
out_dim = 1

model = km.Sequential()
model.add(kl.Dense(units=3, activation='relu', input_shape=(in_dim,)))
model.add(kl.Dense(units=out_dim, activation='relu'))
model.compile(
  loss=kloss.mean_squared_error
, optimizer=ko.Adam(lr=0.1)
)

model.fit(x_train, y_train, epochs=500, batch_size=1, verbose=True)

Output:

Epoch 1/50
3/3 [==============================] - 1s 221ms/step - loss: 225.9046
Epoch 2/50
3/3 [==============================] - 0s 3ms/step - loss: 225.9046
Epoch 3/50
3/3 [==============================] - 0s 3ms/step - loss: 225.9046
Epoch 4/50
3/3 [==============================] - 0s 2ms/step - loss: 225.9046

...and so on (hundreds more)

1 Answer

In short

  • You are waaaaay under-training. Increase the number of times the network gets to see the data. My guess is training may also take longer than you expect, because networks generally train best on zero-mean data and yours is not (a standardization sketch follows this list).
  • ReLU seems to cause problems for a network this shallow. Try adding depth, or switch to the elu activation.
  • I confirmed that the activation on the last layer is not causing a big problem here, but it is still a good idea to get into the habit of knowing when you should and should not put an activation on the final layer.
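
On the zero-mean point: nothing below actually rescales the data (the fix just trains longer), but if you wanted to center the inputs, a minimal numpy sketch would look something like this:

# Sketch only: center and scale the inputs; not part of the fix below.
x_mean = x_train.mean(axis=0)
x_std = x_train.std(axis=0)
x_train_scaled = (x_train - x_mean) / x_std   # roughly zero mean, unit variance
# model.fit(x_train_scaled, y_train, ...)     # scale any new inputs the same way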

Under-training

I dealt with this by increasing the number of training examples by a factor of 10,000 (you could increase the number of epochs instead, as sketched further below, but this makes for nicer printouts):

x_train=np.array([[1,2], [3,4], [5,6]] * 10000)
y_train=np.array([f(x) for x in x_train])[:, np.newaxis]
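
For reference, the epoch-based alternative mentioned above would keep the question's original three examples and simply run fit() for much longer. A rough sketch; the epoch count here is a guess, not something I tuned:

# Sketch of the epoch-based alternative: keep the question's original
# three-example x_train / y_train and just show them many more times.
model.fit(x_train, y_train, epochs=5000, batch_size=1, verbose=False)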

The ReLU problem

The ReLU problem can be handled in one of two ways: add more layers when sticking with ReLU, or use a different activation such as elu. Both trained fine for me:

model = km.Sequential()
model.add(kl.Dense(units=3, activation='relu', input_shape=(in_dim,)))
model.add(kl.Dense(units=3, activation='relu'))
model.add(kl.Dense(units=out_dim))
model.compile(
  loss=kloss.mean_squared_error, optimizer=ko.Adam(lr=0.1)
)

or

model = km.Sequential()
model.add(kl.Dense(units=3, activation='elu', input_shape=(in_dim,)))
model.add(kl.Dense(units=out_dim))
model.compile(
  loss=kloss.mean_squared_error, optimizer=ko.Adam(lr=0.1)
)
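
If you are curious whether the ReLU units in the original two-layer network actually went "dead", one way to check (a sketch, not part of the code above) is to build a probe model on the hidden layer and look at its outputs:

# Sketch: probe the hidden layer of the original (stuck) model after fitting.
# If every activation is zero, the ReLU units are dead and no gradient flows.
probe = km.Model(inputs=model.input, outputs=model.layers[0].output)
print(probe.predict(x_train))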

Full working code

The code below uses elu; you can swap in the ReLU block shown above instead and it prints very similar values.

import numpy as np
import keras.models as km
import keras.layers as kl
import keras.optimizers as ko
import keras.losses as kloss

# Same seed that caused no learning in the question
np.random.seed(1692585618)

def f(x):
    a = x[0] * 3.141 + x[1]
    return a

# Create a sample dataset 
# Input is (*, 2)
# Output is (*, 1)
x_train=np.array([[1,2], [3,4], [5,6]] * 10000)
y_train=np.array([f(x) for x in x_train])[:, np.newaxis]

# These are required by the shape of x_train and y_train
in_dim = x_train.shape[1]
out_dim = 1

model = km.Sequential()
model.add(kl.Dense(units=3, activation='elu', input_shape=(in_dim,)))
model.add(kl.Dense(units=out_dim))
model.compile(
  loss=kloss.mean_squared_error, optimizer=ko.Adam(lr=0.1)
)

model.fit(x_train, y_train, epochs=3, verbose=True)
print('predicted: {}'.format(model.predict(x_train)[:3, 0]))
print('actual   : {}'.format(y_train[:3, 0]))

Prints:

Epoch 1/3
30000/30000 [==============================] - 1s 29us/step - loss: 3.0553
Epoch 2/3
30000/30000 [==============================] - 1s 20us/step - loss: 5.0199e-06
Epoch 3/3
30000/30000 [==============================] - 1s 20us/step - loss: 5.4414e-06
predicted: [ 5.1426897 13.420667  21.707573 ]
actual   : [ 5.141 13.423 21.705]