Siamese network returns the same loss value during training

data-mining python keras tensorflow
2022-02-18 02:21:38

I have about 9k training samples. Pairs of identical vectors are labeled 0, and pairs of non-identical vectors are labeled 1.

I trained for 100 epochs. The loss fluctuated during the first 3 epochs, then stayed at exactly the same value until the end.

Base network architecture:

# Assumed imports; the original excerpt does not show them.
from keras.models import Sequential
from keras.layers import Dense

def _create_base_network(self):
    # Shared encoder applied to both inputs of the Siamese pair.
    a = 'tanh'
    model = Sequential()
    model.add(Dense(self.INPUT_DIM, input_shape=(self.INPUT_DIM, ), activation=a))
    model.add(Dense(600, activation=a))
    model.add(Dense(600, activation=a))
    model.add(Dense(900, activation=a))
    model.add(Dense(1000, activation=a))
    model.add(Dense(5000, activation=a))
    model.add(Dense(1000, activation=a))
    model.add(Dense(900, activation=a))
    model.add(Dense(600, activation=a))
    model.add(Dense(600, activation=a))
    model.add(Dense(self.INPUT_DIM, activation=a))
    return model

Euclidean distance:

K.sqrt(K.maximum(K.sum(K.square(x - y), axis=0, keepdims=True), K.epsilon()))
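The reduction axis in this expression may be worth checking. With batch-first tensors of shape `(batch, INPUT_DIM)`, `axis=0` sums across the batch and yields one value per feature, whereas `axis=1` (the axis used in the Keras Siamese MNIST example) yields one distance per pair. The NumPy sketch below only illustrates the shape difference; the helper name, the `eps` constant standing in for `K.epsilon()`, and the sample batch are made up for illustration.

```python
import numpy as np

def euclidean_distance(x, y, axis):
    # Same formula as the Keras expression, with the reduction axis explicit.
    eps = 1e-7  # stand-in for K.epsilon(); an assumption for this sketch
    sum_square = np.sum(np.square(x - y), axis=axis, keepdims=True)
    return np.sqrt(np.maximum(sum_square, eps))

# Hypothetical batch: 4 pairs, 3 features each.
x = np.array([[0.0, 0.0, 0.0],
              [1.0, 1.0, 1.0],
              [2.0, 0.0, 0.0],
              [0.0, 3.0, 4.0]])
y = np.zeros_like(x)

per_pair = euclidean_distance(x, y, axis=1)     # one distance per pair
per_feature = euclidean_distance(x, y, axis=0)  # one value per feature

print(per_pair.shape)     # (4, 1)
print(per_feature.shape)  # (1, 3)
```

Only the `axis=1` result lines up one-to-one with a label vector of length `batch`.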

Model initialization:

    base_network = self._create_base_network()
    input_a = Input(shape=(self.INPUT_DIM,))
    input_b = Input(shape=(self.INPUT_DIM,))

    processed_a = base_network(input_a)
    processed_b = base_network(input_b)

    distance = Lambda(self._euclidean_distance, output_shape=self._dist_output_shape)([processed_a, processed_b])
    model = Model(inputs=[input_a, input_b], outputs=distance)

Model compilation:

model.compile(loss='mse',
          metrics=['mse'],
          optimizer=optimizers.Adam()
          )
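With `loss='mse'`, the predicted distance is regressed directly onto the 0/1 label, so identical pairs are pulled toward distance 0 and non-identical pairs toward distance 1. A small pure-Python sketch of that objective, using made-up distances rather than real model output:

```python
def mse(distances, labels):
    # Mean squared error between predicted distances and 0/1 pair labels.
    return sum((d - t) ** 2 for d, t in zip(distances, labels)) / len(labels)

# Hypothetical values: label 0 = identical pair, label 1 = non-identical pair.
labels = [0, 1, 0, 1]
distances = [0.1, 0.9, 0.2, 1.4]

loss = mse(distances, labels)
print(round(loss, 3))  # 0.055
```

Note that the last pair is penalized even though it is non-identical, because its distance overshoots the target of 1.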

Training the model:

model.fit([self.train_d['vec1'], self.train_d['vec2']], self.train_d['label'], batch_size=128, epochs=self.args.epochs, shuffle=True)

Training log:

Epoch 1/100
9544/9544 [==============================] - 6s 594us/step - loss: 0.4556 - mean_squared_error: 0.4556
Epoch 2/100
9544/9544 [==============================] - 4s 470us/step - loss: 1.0693 - mean_squared_error: 1.0693
Epoch 3/100
9544/9544 [==============================] - 4s 464us/step - loss: 0.7328 - mean_squared_error: 0.7328
Epoch 4/100
9544/9544 [==============================] - 4s 465us/step - loss: 0.7328 - mean_squared_error: 0.7328
Epoch 5/100
9544/9544 [==============================] - 4s 461us/step - loss: 0.7328 - mean_squared_error: 0.7328
Epoch 99/100
9544/9544 [==============================] - 4s 470us/step - loss: 0.7328 - mean_squared_error: 0.7328
Epoch 100/100
9544/9544 [==============================] - 4s 470us/step - loss: 0.7328 - mean_squared_error: 0.7328

Can anyone help me figure out what the problem is here?

Thanks in advance.

1 Answer

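In the end, I found a way to solve this problem. I completely changed the architecture and then lowered the learning rate to 0.00005. After that, the model's loss decreased steadily.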
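One possible explanation, which is not confirmed anywhere in this thread, is that large early updates pushed the `tanh` units into saturation, where the gradient is close to zero and the weights effectively stop changing. The sketch below only shows how quickly the `tanh` gradient vanishes as the pre-activation grows; it is not a diagnosis of the model above, and `tanh_grad` is a helper name made up for illustration.

```python
import math

def tanh_grad(z):
    # Derivative of tanh: d/dz tanh(z) = 1 - tanh(z)^2
    return 1.0 - math.tanh(z) ** 2

# The gradient shrinks rapidly once |z| moves away from zero.
for z in (0.0, 2.0, 5.0, 10.0):
    print(f"z={z:>4}: tanh={math.tanh(z):.6f}, grad={tanh_grad(z):.2e}")
```

A smaller learning rate and a shallower stack both make it less likely that pre-activations are driven that far out in the first few epochs.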

New architecture:

a = 'tanh'
model = Sequential()
model.add(Dense(self.INPUT_DIM, input_shape=(self.INPUT_DIM, ), activation=a))
model.add(Dense(600, activation=a))
model.add(Dense(self.INPUT_DIM, activation=a))
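For reference, a self-contained sketch of this smaller base network together with the lower learning rate might look like the following. It assumes the Keras bundled with TensorFlow 2.x, where the optimizer argument is `learning_rate` (older standalone Keras used `lr`). `input_dim = 128` is a placeholder, since the real `INPUT_DIM` is not given here.

```python
from tensorflow import keras
from tensorflow.keras import layers

def create_base_network(input_dim):
    # Three tanh layers, mirroring the reduced architecture above.
    return keras.Sequential([
        keras.Input(shape=(input_dim,)),
        layers.Dense(input_dim, activation='tanh'),
        layers.Dense(600, activation='tanh'),
        layers.Dense(input_dim, activation='tanh'),
    ])

input_dim = 128  # placeholder; the actual INPUT_DIM is not stated
base_network = create_base_network(input_dim)

# Adam defaults to a learning rate of 1e-3; the fix used 5e-5 instead.
optimizer = keras.optimizers.Adam(learning_rate=5e-5)
```

The `optimizer` object would then be passed to `model.compile(...)` in place of the default `optimizers.Adam()`.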