Adding an input layer to a pretrained model

Data Mining Python Keras
2021-09-30 12:17:16

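I am trying to feed a numpy array of shape (1036800,), originally an image of shape (480, 720, 3), into a pretrained VGG16 model to predict a continuous value.

I have tried several variations of the following code: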

input = Input(shape=(1036800,), name='image_input')
initial_model = VGG16(weights='imagenet', include_top=False)
x = Flatten()(initial_model(input).output)
x = Dense(200, activation='relu')(x)
x = Dense(1)(x)
model = Model(inputs=input, outputs=x)

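Earlier variations of the code above produced errors about the input dimensions: input_shape needs to have 3 channels (raised when passing (1036800,) as that argument to the VGG16 constructor). Running the code above produces the most recent error: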

Traceback (most recent call last):
  File "model_alex.py", line 57, in <module>
    model = initialize_model()
  File "model_alex.py", line 20, in initialize_model
    x = Flatten()(initial_model(input).output)
  File "/home/aicg2/.local/lib/python2.7/site-packages/keras/engine/topology.py", line 596, in __call__
    output = self.call(inputs, **kwargs)
  File "/home/aicg2/.local/lib/python2.7/site-packages/keras/engine/topology.py", line 2061, in call
    output_tensors, _, _ = self.run_internal_graph(inputs, masks)
  File "/home/aicg2/.local/lib/python2.7/site-packages/keras/engine/topology.py", line 2212, in run_internal_graph
    output_tensors = _to_list(layer.call(computed_tensor, **kwargs))
  File "/home/aicg2/.local/lib/python2.7/site-packages/keras/layers/convolutional.py", line 164, in call
    dilation_rate=self.dilation_rate)
  File "/home/aicg2/.local/lib/python2.7/site-packages/keras/backend/tensorflow_backend.py", line 3156, in conv2d
    data_format='NHWC')
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/nn_ops.py", line 639, in convolution
    input_channels_dim = input.get_shape()[num_spatial_dims + 1]
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/tensor_shape.py", line 500, in __getitem__
    return self._dims[key]
IndexError: list index out of range

Here is the full code, and here is a sample data file used in the script.

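One possible solution would be to resize the original image files to 224x224 and convert them into numpy arrays of shape (224, 224, 3), so that they fit the first layer of the pretrained model. However, I do not want to distort the images or spend another night preprocessing data when I should already be training.

Beyond that, all I can think of is googling my problem and trying to adapt the solutions I find, or aimlessly tweaking various shape-related parameters and functions. Neither has gotten me very far in the past 4 hours.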

1 Answer

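The problem is that you should not flatten the images into 1D vectors. VGG16 contains 2D convolution layers (e.g. spatial convolution over images), which require input of shape (number_of_images, image_height, image_width, image_channels), assuming keras.backend.image_data_format() returns 'channels_last'. If your image_data_format is 'channels_first', change the input data shape to (number_of_images, image_channels, image_height, image_width).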
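As a minimal sketch of that shape requirement, the snippet below restores a flattened batch to both layouts with numpy. It uses small stand-in dimensions and made-up data (the question's images are 480 x 720 x 3, which flattens to 1036800 values), and it assumes the arrays were flattened in numpy's default row-major order from a (height, width, channels) layout.

```python
import numpy as np

# Small stand-in dimensions so the sketch runs instantly;
# the question's images are 480 x 720 x 3 (= 1036800 values each).
HEIGHT, WIDTH, CHANNELS = 4, 6, 3
NUM_IMAGES = 2

# Hypothetical data: build a batch of images, then flatten each one
# the way the arrays in the question were flattened.
images = np.arange(NUM_IMAGES * HEIGHT * WIDTH * CHANNELS, dtype=np.float32)
images = images.reshape((NUM_IMAGES, HEIGHT, WIDTH, CHANNELS))
flat = images.reshape((NUM_IMAGES, -1))  # shape (2, 72), analogous to (N, 1036800)

# channels_last: undo the flattening directly.
channels_last = flat.reshape((NUM_IMAGES, HEIGHT, WIDTH, CHANNELS))

# channels_first: restore the original layout first, then move the channel axis.
# Reshaping straight to (N, C, H, W) would scramble the pixels.
channels_first = channels_last.transpose((0, 3, 1, 2))

print(channels_last.shape)   # (2, 4, 6, 3)
print(channels_first.shape)  # (2, 3, 4, 6)
```

The reshape only changes how the existing values are indexed, so no image is resized or distorted.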

Here is your fixed code (tested with Keras 2.0.4):

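from keras.applications.vgg16 import VGG16
from keras.layers import Input, Flatten, Dense, BatchNormalization, Dropout
from keras.models import Model

# Restore the image dimensions; x_train, y_train, x_test and y_test come from your script
x_train = x_train.reshape((x_train.shape[0], 480, 720, 3))
x_test = x_test.reshape((x_test.shape[0], 480, 720, 3))

# Convolutional base only (include_top=False), so the input is not tied to 224x224
initial_model = VGG16(weights='imagenet', include_top=False)
input = Input(shape=(480, 720, 3), name='image_input')
# Calling the model on a tensor already returns a tensor, so no .output is needed
x = Flatten()(initial_model(input))
x = Dense(200, activation='relu')(x)
x = BatchNormalization()(x)
x = Dropout(0.5)(x)
x = Dense(1)(x)  # single linear unit for the continuous target
model = Model(inputs=input, outputs=x)
model.compile(loss='mse', optimizer='adam')

model.fit(x_train, y_train, epochs=20, batch_size=16)
score = model.evaluate(x_test, y_test, batch_size=16)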
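To get a feel for how large the flattened feature vector becomes at this input size, here is a rough back-of-the-envelope sketch in plain Python. It is not part of the tested code above. It assumes the standard VGG16 layout: five 2x2 max-pooling layers with stride 2, convolutions that preserve the spatial size, and 512 channels in the last block. The helper name vgg16_feature_shape is made up for illustration.

```python
def vgg16_feature_shape(height, width, num_pool_layers=5, channels=512):
    """Estimate the output shape of VGG16 without its top (channels_last)."""
    for _ in range(num_pool_layers):
        # Each 2x2 max-pooling layer with stride 2 halves the size, rounding down.
        height //= 2
        width //= 2
    return (height, width, channels)


feature_shape = vgg16_feature_shape(480, 720)
flattened_size = feature_shape[0] * feature_shape[1] * feature_shape[2]
dense_params = flattened_size * 200 + 200  # weights plus biases of Dense(200)

print(feature_shape)   # (15, 22, 512)
print(flattened_size)  # 168960
print(dense_params)    # 33792200
```

Under these assumptions the first Dense layer alone holds roughly 33.8 million parameters, which is worth keeping in mind when choosing the batch size.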