我在使用神经网络方面还很陌生,并且认为我犯了一些基本错误。我正在尝试将模拟图像分配给 5 个类进行测试,哪些(如果有)网络有助于解决我们小组中的大数据问题。我正在用 40000 个(256 x 256 像素灰度)图像训练 Keras 中包含的 ResNet50 CNN。虽然训练损失在第一个时期内迅速改善,但验证损失以(对我而言)相当随机的方式剧烈波动。
我正在尝试使用尽可能多的高级函数,并最终得到以下代码:
from keras.preprocessing.image import ImageDataGenerator
from keras.applications.resnet50 import ResNet50
from keras.models import Model
from keras.layers import GlobalAveragePooling2D, Dense
inputShape = (256, 256,1)
targetSize = (256, 256 )
batchSize = 128
# First, load resnet
base_model = ResNet50(include_top =False,
weights =None,
input_shape=inputShape)
x = base_model.output
# modified based on the article https://github.com/priya-dwivedi/
# Deep-Learning/blob/master/resnet_keras/Residual_Networks_yourself.ipynb
x = GlobalAveragePooling2D()(x)
predictions = Dense(5, activation= 'softmax')(x)
model = Model(inputs = base_model.input, outputs = predictions)
# Compile model, might want to change loss and metrics
model.compile(loss='categorical_crossentropy',
optimizer='adam',
metrics = ["acc"]
)
# Define the generators to read the data. Training generator performs data argumentation as well
train_datagen = ImageDataGenerator(
samplewise_center =True,
samplewise_std_normalization = True,
shear_range=0.2,
zoom_range=0.2,
horizontal_flip=True)
test_datagen = ImageDataGenerator(
samplewise_center =True,
samplewise_std_normalization = True,
)
train_generator = train_datagen.flow_from_directory(
'trainingData/FolderBasedTrain256',
target_size=targetSize,
color_mode = 'grayscale',
batch_size=batchSize,
save_to_dir = 'trainingData/argumentedImages',
class_mode='categorical')
validation_generator = test_datagen.flow_from_directory(
'trainingData/FolderBasedValidation256',
target_size=targetSize,
batch_size=batchSize,
color_mode = 'grayscale',
class_mode='categorical')
# Fit the Modell, saving the history for later plotting
history = model.fit_generator(
train_generator,
epochs=20,
validation_data=validation_generator,
#steps_per_epoch = 62,
steps_per_epoch = 31,
#callbacks=[checkpointer],
validation_steps=9)
我总是可以创建更多图像或训练更长时间,但对我来说,这看起来好像是某个地方出了问题。我将非常感谢任何和所有的想法。谢谢!
编辑:有人告诉我要强调验证和训练集都是由完全相同的模拟程序创建的,所以它们应该相对容易分类
EDIT2:发现错误!我的批量大小与数据量和 epoch 的步数不匹配,导致 CNN 看不到所有的训练数据。现在一切都很好地收敛,我可以评估模型的选择。感谢所有贡献者
