Poor CNN performance after implementing BatchNormalization

data-mining  neural-network  cnn  batch-normalization
2022-02-14 02:18:38

I am training a CNN to classify malware images from a dataset called Malimg. Before adding a BatchNormalization layer I was getting 95.57% accuracy (see the training log and the loss/accuracy and validation loss/accuracy plots below):

[plots: accuracy, loss]

Epoch 1/10
6537/6537 [==============================] - 53s 8ms/step - loss: 1.7711 - accuracy: 0.4605 - val_loss: 1.0062 - val_accuracy: 0.6510
Epoch 2/10
6537/6537 [==============================] - 52s 8ms/step - loss: 0.8739 - accuracy: 0.7150 - val_loss: 0.4965 - val_accuracy: 0.8426
Epoch 3/10
6537/6537 [==============================] - 52s 8ms/step - loss: 0.5163 - accuracy: 0.8406 - val_loss: 0.3061 - val_accuracy: 0.9136
Epoch 4/10
6537/6537 [==============================] - 54s 8ms/step - loss: 0.3656 - accuracy: 0.8897 - val_loss: 0.1989 - val_accuracy: 0.9408
Epoch 5/10
6537/6537 [==============================] - 53s 8ms/step - loss: 0.3063 - accuracy: 0.9016 - val_loss: 0.1822 - val_accuracy: 0.9490
Epoch 6/10
6537/6537 [==============================] - 52s 8ms/step - loss: 0.2657 - accuracy: 0.9166 - val_loss: 0.1886 - val_accuracy: 0.9472
Epoch 7/10
6537/6537 [==============================] - 52s 8ms/step - loss: 0.2366 - accuracy: 0.9237 - val_loss: 0.1618 - val_accuracy: 0.9536
Epoch 8/10
6537/6537 [==============================] - 52s 8ms/step - loss: 0.2071 - accuracy: 0.9315 - val_loss: 0.1341 - val_accuracy: 0.9615
Epoch 9/10
6537/6537 [==============================] - 52s 8ms/step - loss: 0.2017 - accuracy: 0.9330 - val_loss: 0.1424 - val_accuracy: 0.9618
Epoch 10/10
6537/6537 [==============================] - 51s 8ms/step - loss: 0.1882 - accuracy: 0.9362 - val_loss: 0.1425 - val_accuracy: 0.9557
2802/2802 [==============================] - 9s 3ms/step

After implementing BatchNormalization my results are poor and the model is all over the place (see the results below). Is there a reason for this? I know what BatchNormalization does: it stabilizes the learning process and can improve a neural network's performance, but I was advised to implement it.
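
For reference, here is a minimal NumPy sketch (an illustration only, not the Keras implementation) of what a BatchNormalization layer computes on a batch during training; gamma and beta stand for the layer's learned scale and shift:

import numpy as np

def batch_norm_train(x, gamma, beta, eps=1e-3):
    # Normalize each feature over the batch, then apply the learned scale and shift
    mu = x.mean(axis=0)                    # per-feature batch mean
    var = x.var(axis=0)                    # per-feature batch variance
    x_hat = (x - mu) / np.sqrt(var + eps)  # zero mean, unit variance
    return gamma * x_hat + beta

x = np.array([[1., 2., 3.],
              [2., 4., 6.],
              [0., 0., 2.]])               # toy batch: 3 samples, 3 features
out = batch_norm_train(x, gamma=np.ones(3), beta=np.zeros(3))
print(out.mean(axis=0))                    # ~0 per feature
print(out.std(axis=0))                     # ~1 per feature

At inference time Keras replaces the batch statistics with running averages collected during training, which is one reason training and validation behavior can diverge once BatchNormalization is added.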

Below is the code for the network and the results after implementing BatchNormalization:

import numpy as np
from sklearn.utils import class_weight
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, BatchNormalization, Dropout, Flatten, Dense

num_classes = 25  # number of output classes (Malimg malware families)

model_x = Sequential()
model_x.add(Conv2D(32, kernel_size=(3, 3), activation='relu', input_shape=(64, 64, 3)))
model_x.add(MaxPooling2D(pool_size=(2, 2)))
model_x.add(BatchNormalization())
model_x.add(Conv2D(16, (3, 3), activation='relu'))
model_x.add(MaxPooling2D(pool_size=(2, 2)))
model_x.add(Dropout(0.25))
model_x.add(Flatten())
model_x.add(Dense(128, activation='relu'))
model_x.add(Dropout(0.5))
model_x.add(Dense(50, activation='relu'))
model_x.add(Dense(num_classes, activation='softmax'))
model_x.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])

# Recover integer labels from the one-hot targets and weight the loss by class frequency
y_train_new = np.argmax(y_train, axis=1)
class_weights = class_weight.compute_class_weight(
    class_weight='balanced', classes=np.unique(y_train_new), y=y_train_new)
class_weights = dict(enumerate(class_weights))  # Keras expects a dict mapping class index to weight

history = model_x.fit(X_train, y_train, validation_data=(X_test, y_test), epochs=10,
                      class_weight=class_weights)
scores = model_x.evaluate(X_test, y_test)

model_x.summary()

[plots: accuracy after batch norm, loss after batch norm]

Updated results after the suggestion:

[plots: accuracy after the suggestion, loss after the suggestion]

Code after the suggestion:

num_classes = 25  # number of output classes (Malimg malware families)

model_x = Sequential()

model_x.add(Conv2D(32, kernel_size=(3, 3), activation='relu', input_shape=(64, 64, 3)))
model_x.add(BatchNormalization())  # moved: now directly after the first conv layer
model_x.add(MaxPooling2D(pool_size=(2, 2)))
model_x.add(Conv2D(16, (3, 3), activation='relu'))
model_x.add(BatchNormalization())  # added after the second conv layer as well
model_x.add(MaxPooling2D(pool_size=(2, 2)))
model_x.add(Dropout(0.25))
model_x.add(Flatten())
model_x.add(Dense(128, activation='relu'))
model_x.add(Dropout(0.5))
model_x.add(Dense(50, activation='relu'))
model_x.add(Dense(num_classes, activation='softmax'))
model_x.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])

y_train_new = np.argmax(y_train, axis=1)
class_weights = class_weight.compute_class_weight(
    class_weight='balanced', classes=np.unique(y_train_new), y=y_train_new)
class_weights = dict(enumerate(class_weights))  # Keras expects a dict mapping class index to weight

history = model_x.fit(X_train, y_train, validation_data=(X_test, y_test), epochs=10,
                      class_weight=class_weights)
scores = model_x.evaluate(X_test, y_test)

model_x.summary()
1 Answer

You are applying BatchNorm after the MaxPool layer. It is applied after the convolutional layer, not the pooling layer; that is why your network is behaving this way.

Place BatchNorm directly after every Conv layer.
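
For completeness, another placement seen in practice (not stated in this answer, just a common variant) is to put BatchNormalization between the convolution and its activation rather than after the activated output. A sketch of the first block in that style:

from tensorflow.keras.layers import Activation

model_x.add(Conv2D(32, kernel_size=(3, 3), input_shape=(64, 64, 3)))  # note: no activation here
model_x.add(BatchNormalization())   # normalize the pre-activation outputs
model_x.add(Activation('relu'))     # apply the nonlinearity after normalization
model_x.add(MaxPooling2D(pool_size=(2, 2)))

Both placements are used in published architectures; it is worth trying each and comparing the validation curves.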