Training accuracy is higher than validation accuracy

data-mining tensorflow dataset accuracy overfitting
2022-02-20 09:20:29

My model's training accuracy is much higher than its validation accuracy, by roughly 0.2. I don't understand why. I'm still new to this, so please bear with me.

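The data comes from two datasets created with tf.data.Dataset, one for training and one for validation, because that is how the dataset's folders are laid out.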

from tensorflow import keras

model = keras.Sequential([
    keras.layers.Flatten(),
    keras.layers.Dense(256, activation='relu'),
    keras.layers.Dropout(.1),
    keras.layers.Dense(2, activation="softmax")
])

model.compile(optimizer='adam',
            loss="categorical_crossentropy",
            metrics=['accuracy'])

model.fit(train_ds, steps_per_epoch=STEPS_PER_EPOCH, epochs=10, validation_data=test_ds, validation_steps=VALIDATION_STEPS)
Train for 163.0 steps, validate for 20.0 steps
Epoch 1/10
163/163 [==============================] - 3s 21ms/step - loss: 3.9965 - accuracy: 0.8468 - val_loss: 0.3582 - val_accuracy: 0.8406
Epoch 2/10
163/163 [==============================] - 3s 19ms/step - loss: 0.3197 - accuracy: 0.8930 - val_loss: 0.5207 - val_accuracy: 0.7641
Epoch 3/10
163/163 [==============================] - 3s 19ms/step - loss: 0.2009 - accuracy: 0.9191 - val_loss: 0.4350 - val_accuracy: 0.8062
Epoch 4/10
163/163 [==============================] - 3s 19ms/step - loss: 0.1815 - accuracy: 0.9270 - val_loss: 0.5521 - val_accuracy: 0.7516
Epoch 5/10
163/163 [==============================] - 3s 19ms/step - loss: 0.2122 - accuracy: 0.8986 - val_loss: 0.9616 - val_accuracy: 0.7156
Epoch 6/10
163/163 [==============================] - 3s 19ms/step - loss: 0.2405 - accuracy: 0.9082 - val_loss: 1.2039 - val_accuracy: 0.7109
Epoch 7/10
163/163 [==============================] - 3s 19ms/step - loss: 0.2013 - accuracy: 0.9183 - val_loss: 0.7242 - val_accuracy: 0.6406
Epoch 8/10
163/163 [==============================] - 3s 19ms/step - loss: 0.2055 - accuracy: 0.9176 - val_loss: 0.4830 - val_accuracy: 0.6891
Epoch 9/10
163/163 [==============================] - 3s 19ms/step - loss: 0.1901 - accuracy: 0.9250 - val_loss: 0.3925 - val_accuracy: 0.8313
Epoch 10/10
163/163 [==============================] - 3s 19ms/step - loss: 0.1861 - accuracy: 0.9202 - val_loss: 0.5492 - val_accuracy: 0.8000

Can anyone explain what is causing the large gap between accuracy and val_accuracy?

1 Answer

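Your model is clearly overfitting. You should use a higher dropout rate, for example 0.5. For better generalization, use a deeper model. You can also use early stopping so that training halts before the model overfits badly.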
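
To make the early stopping suggestion concrete, here is a minimal sketch in plain Python that replays the val_loss values from the log in the question and reports where a patience-based rule would have stopped. The function name early_stopping_epoch and the choice of patience=3 are assumptions made for illustration, not something from the original post.

```python
# val_loss per epoch, copied from the training log in the question
val_loss = [0.3582, 0.5207, 0.4350, 0.5521, 0.9616,
            1.2039, 0.7242, 0.4830, 0.3925, 0.5492]


def early_stopping_epoch(losses, patience):
    """Return (stop_epoch, best_epoch), both 1-based.

    Training stops once `patience` consecutive epochs pass without a
    new minimum of the monitored loss. If that never happens,
    stop_epoch is simply the last epoch.
    (Hypothetical helper, written only for this illustration.)
    """
    best = float("inf")
    best_epoch = 0
    wait = 0  # epochs since the last improvement
    for epoch, loss in enumerate(losses, start=1):
        if loss < best:
            best, best_epoch, wait = loss, epoch, 0
        else:
            wait += 1
            if wait >= patience:
                return epoch, best_epoch
    return len(losses), best_epoch


# patience=3 is an assumed value; tune it for your own run
stop_epoch, best_epoch = early_stopping_epoch(val_loss, patience=3)
print(f"stop after epoch {stop_epoch}, keep weights from epoch {best_epoch}")
# prints: stop after epoch 4, keep weights from epoch 1
```

On this log the lowest val_loss (0.3582) occurs in epoch 1 and is never beaten, which is consistent with the overfitting diagnosis. In Keras the same rule should be available as the keras.callbacks.EarlyStopping callback (for example with monitor='val_loss', patience=3, restore_best_weights=True), passed to model.fit through its callbacks argument.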