CIFAR-10 classification using VGG16 keeps showing the same output?

data-mining deep-learning classification keras convolutional-neural-network model-selection
2022-03-13 23:45:09

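I am currently trying to classify the cifar10 data using the vgg16 network in Keras, but I seem to get pretty bad results, and I can't quite figure out why...

vgg16 is designed for classification on 1000-class problems. Why doesn't it work for a small problem setting like cifar10?

Code: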

from keras.utils import np_utils

from keras import metrics
import keras
from keras import backend as K
from keras.layers import Conv1D,Conv2D,MaxPooling2D, MaxPooling1D, Reshape
from keras.models import Model
from keras.layers import Input, Dense
import tensorflow as tf
from keras.datasets import mnist,cifar10



WEIGHTS_PATH = 'https://github.com/fchollet/deep-learning-models/releases/download/v0.1/vgg16_weights_tf_dim_ordering_tf_kernels.h5'
WEIGHTS_PATH_NO_TOP = 'https://github.com/fchollet/deep-learning-models/releases/download/v0.1/vgg16_weights_tf_dim_ordering_tf_kernels_notop.h5'

batch_size = 42
num_classes = 10
epochs = 100

# input image dimensions
img_rows, img_cols = 32, 32

# the data, shuffled and split between train and test sets
(x_train, y_train), (x_test, y_test) = cifar10.load_data()

#print('x_train shape:', x_train.shape)
#print(x_train.shape[0], 'train samples')
#print(x_test.shape[0], 'test samples')

if K.image_data_format() == 'channels_first':
    x_train = x_train.reshape(x_train.shape[0], 1, img_rows, img_cols)
    x_test = x_test.reshape(x_test.shape[0], 1, img_rows, img_cols)
    input_shape = (1, img_rows, img_cols)
else:
    x_train = x_train.reshape(x_train.shape[0], img_rows, img_cols, 3)
    x_test = x_test.reshape(x_test.shape[0], img_rows, img_cols, 3)
    input_shape = (img_rows, img_cols, 1)

x_train = x_train.astype('float32')
x_test = x_test.astype('float32')

x_train /= 255
x_test /= 255



y_train = keras.utils.to_categorical(y_train, num_classes)
y_test = keras.utils.to_categorical(y_test, num_classes)


print('x_train shape:', x_train.shape)
print(x_train.shape[0], 'train samples')
print(x_test.shape[0], 'test samples')

print('y_train shape:', y_train.shape)
print(y_train.shape, 'train samples')
print(y_test.shape, 'test samples')
def fws():
    #print "Inside"
    #   Params:
    #   batch ,  lr, decay , momentum, epochs
    #
    #Input shape: (batch_size,40,45,3)
    #output shape: (1,15,50)
    # number of unit in conv_feature_map = splitd
    input = Input(shape=(img_rows,img_cols,3))
    zero_padded_section = keras.layers.convolutional.ZeroPadding2D(padding=(96,96), data_format='channels_last')(input)
    print zero_padded_section
    model = keras.applications.vgg16.VGG16(include_top = False,
                    weights = 'imagenet',
                    input_shape = (224,224,3),
                    pooling = 'max',
                    classes = 10)

    model_output = model(input)
    print model_output

    #FC
    dense1 = Dense(units = 4000, activation = 'relu',    name = "dense_1")(model_output)
    dense2 = Dense(units = 4000, activation = 'relu',    name = "dense_2")(dense1)
    dense3 = Dense(units = 10 , activation = 'softmax', name = "dense_3")(dense1)


    model = Model(inputs = input , outputs = dense3)
    #sgd = SGD(lr=0.08,decay=0.025,momentum = 0.99,nesterov = True)
    model.compile(loss="categorical_crossentropy", optimizer='adam' , metrics = [metrics.categorical_accuracy])

    model.fit(x_train[:500], y_train[:500],
              batch_size=batch_size,
              epochs=epochs,
              verbose=1,
              validation_data=(x_test[:10], y_test[:10]))
    score = model.evaluate(x_test, y_test, verbose=0)
    print('Test loss:', score[0])
    print('Test accuracy:', score[1])


fws()

Current results:

500/500 [==============================] - 166s - loss: 14.6030 - categorical_accuracy: 0.0940 - val_loss: 12.8945 - val_categorical_accuracy: 0.2000
Epoch 36/100
500/500 [==============================] - 163s - loss: 14.6030 - categorical_accuracy: 0.0940 - val_loss: 12.8945 - val_categorical_accuracy: 0.2000
Epoch 37/100
500/500 [==============================] - 166s - loss: 14.6030 - categorical_accuracy: 0.0940 - val_loss: 12.8945 - val_categorical_accuracy: 0.2000
Epoch 38/100
500/500 [==============================] - 168s - loss: 14.6030 - categorical_accuracy: 0.0940 - val_loss: 12.8945 - val_categorical_accuracy: 0.2000
Epoch 39/100
500/500 [==============================] - 167s - loss: 14.6030 - categorical_accuracy: 0.0940 - val_loss: 12.8945 - val_categorical_accuracy: 0.2000
Epoch 40/100
500/500 [==============================] - 163s - loss: 14.6030 - categorical_accuracy: 0.0940 - val_loss: 12.8945 - val_categorical_accuracy: 0.2000
Epoch 41/100
500/500 [==============================] - 167s - loss: 14.6030 - categorical_accuracy: 0.0940 - val_loss: 12.8945 - val_categorical_accuracy: 0.2000
Epoch 42/100
500/500 [==============================] - 165s - loss: 14.6030 - categorical_accuracy: 0.0940 - val_loss: 12.8945 - val_categorical_accuracy: 0.2000
Epoch 43/100
500/500 [==============================] - 178s - loss: 14.6030 - categorical_accuracy: 0.0940 - val_loss: 12.8945 - val_categorical_accuracy: 0.2000
Epoch 44/100
462/500 [==========================>...] - ETA: 16s - loss: 14.6179 - categorical_accuracy: 0.0931 

And it just keeps printing the same output...

1 Answer

The Keras VGG16 documentation says:

    input_shape: optional shape tuple, only to be specified
        if `include_top` is False (otherwise the input shape
        has to be `(224, 224, 3)` (with `channels_last` data format)
        or `(3, 224, 224)` (with `channels_first` data format).
        It should have exactly 3 inputs channels,
        and width and height should be no smaller than 48.
        E.g. `(200, 200, 3)` would be one valid value.
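
The size constraint quoted above can be checked against the data before any model is built. The following is only a minimal sketch, assuming `channels_last` data and nearest-neighbour upscaling with NumPy. `MIN_SIDE`, `upscale_nearest` and `check_vgg16_input_shape` are hypothetical names introduced for illustration, and the random array is a stand-in for a CIFAR-10 batch, not real data.

```python
import numpy as np

MIN_SIDE = 48  # minimum width/height stated in the quoted docstring


def upscale_nearest(images, factor):
    """Enlarge a batch of channels_last images by an integer factor
    by repeating each pixel (nearest-neighbour upscaling)."""
    images = np.repeat(images, factor, axis=1)  # repeat rows
    return np.repeat(images, factor, axis=2)    # repeat columns


def check_vgg16_input_shape(shape):
    """Return True if a (height, width, channels) shape satisfies
    the constraint from the quoted docstring."""
    height, width, channels = shape
    return channels == 3 and height >= MIN_SIDE and width >= MIN_SIDE


# Random stand-in for a small CIFAR-10 batch (assumption, not real data).
batch = np.random.rand(4, 32, 32, 3).astype("float32")
upscaled = upscale_nearest(batch, 2)

print(batch.shape[1:], check_vgg16_input_shape(batch.shape[1:]))
print(upscaled.shape[1:], check_vgg16_input_shape(upscaled.shape[1:]))
```

Under the quoted constraint, the raw 32x32 images fail the check and the 64x64 upscaled ones pass. Whichever size is chosen, the `input_shape` given to `VGG16` would need to match the tensor that is actually fed to the network.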

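So I would be concerned about what is happening in your code with respect to the data shape and the input shape. Beyond that, I would also be concerned about how suitable vgg16 is for 32x32 cifar images.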
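
As a side check, the constant numbers in the log are consistent with a network whose softmax has saturated, so that every prediction is effectively a one-hot vector. The sketch below assumes that Keras clips predicted probabilities at its default epsilon of 1e-7 before taking the logarithm. `EPSILON` and `stuck_loss` are hypothetical names, and this is an interpretation of the log rather than a confirmed diagnosis.

```python
import math

EPSILON = 1e-7  # assumed clipping value (Keras' default backend epsilon)


def stuck_loss(accuracy, eps=EPSILON):
    """Mean categorical cross-entropy when every prediction is one-hot:
    a wrong sample costs -log(eps), a correct one costs -log(1 - eps)."""
    wrong = (1.0 - accuracy) * -math.log(eps)
    right = accuracy * -math.log(1.0 - eps)
    return wrong + right


# Accuracies taken from the log in the question.
print(f"{stuck_loss(0.0940):.4f}")  # training:   14.6030
print(f"{stuck_loss(0.2000):.4f}")  # validation: 12.8945
```

Both values match the `loss` and `val_loss` in the log to four decimals. That would suggest the model assigns nearly all probability to a single class and receives almost no gradient, which fits the output not changing between epochs.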