我正在尝试使用 Keras 构建一个模型,该模型可以从覆盖大约 10-30% 图像的显微镜噪声图像中预测四类特征。我使用 U-net 是因为我的数据集很小(150 张图像用于训练,30 张图像用于验证)。作为指标,我使用准确率、损失、交叉联合和骰子系数,在 100 个训练阶段后得到以下结果:
loss: 0.0518 - accuracy: 0.9555 - dice_coef: 0.9480 - iou_coef: 0.9038 - val_loss: 0.0922 - val_accuracy: 0.9125 - val_dice_coef: 0.9079 - val_iou_coef: 0.8503
不幸的是,当我根据上述指标显示原始图像和预测图像时,它们之间的匹配度不如我预期的那样,而似乎无法识别类之间的差异。
那可能吗 ?以下是我构建的指标和 U-net 模型的代码:
# IOU metric
def iou_coef(y_true, y_pred, smooth=1):
intersection = K.sum(K.abs(y_true * y_pred), axis=[1,2,3])
union = K.sum(y_true,[1,2,3])+K.sum(y_pred,[1,2,3])-intersection
iou = K.mean((intersection + smooth) / (union + smooth), axis=0)
return iou
def dice_coef(y_true, y_pred, smooth=1):
intersection = K.sum(y_true * y_pred, axis=[1,2,3])
union = K.sum(y_true, axis=[1,2,3]) + K.sum(y_pred, axis=[1,2,3])
dice = K.mean((2. * intersection + smooth)/(union + smooth), axis=0)
return dice
def unet(pretrained_weights=None, input_size=(IMG_SIZE, IMG_SIZE, 3),num_class=2):
inputs = Input(input_size)
conv1 = Conv2D(64, 3, activation='relu', padding='same', kernel_initializer='he_normal')(inputs)
conv1 = BatchNormalization()(conv1)
conv1 = Conv2D(64, 3, activation='relu', padding='same', kernel_initializer='he_normal')(conv1)
conv1 = BatchNormalization()(conv1)
pool1 = MaxPooling2D(pool_size=(2, 2))(conv1)
conv2 = Conv2D(128, 3, activation='relu', padding='same', kernel_initializer='he_normal')(pool1)
conv2 = BatchNormalization()(conv2)
conv2 = Conv2D(128, 3, activation='relu', padding='same', kernel_initializer='he_normal')(conv2)
conv2 = BatchNormalization()(conv2)
pool2 = MaxPooling2D(pool_size=(2, 2))(conv2)
conv3 = Conv2D(256, 3, activation='relu', padding='same', kernel_initializer='he_normal')(pool2)
conv3 = BatchNormalization()(conv3)
conv3 = Conv2D(256, 3, activation='relu', padding='same', kernel_initializer='he_normal')(conv3)
conv3 = BatchNormalization()(conv3)
pool3 = MaxPooling2D(pool_size=(2, 2))(conv3)
conv4 = Conv2D(512, 3, activation='relu', padding='same', kernel_initializer='he_normal')(pool3)
conv4 = BatchNormalization()(conv4)
conv4 = Conv2D(512, 3, activation='relu', padding='same', kernel_initializer='he_normal')(conv4)
conv4 = BatchNormalization()(conv4)
drop4 = Dropout(0.5)(conv4)
pool4 = MaxPooling2D(pool_size=(2, 2))(drop4)
conv5 = Conv2D(1024, 3, activation='relu', padding='same', kernel_initializer='he_normal')(pool4)
conv5 = BatchNormalization()(conv5)
conv5 = Conv2D(1024, 3, activation='relu', padding='same', kernel_initializer='he_normal')(conv5)
conv5 = BatchNormalization()(conv5)
drop5 = Dropout(0.5)(conv5)
up6 = Conv2D(512, 2, activation='relu', padding='same', kernel_initializer='he_normal')(
UpSampling2D(size=(2, 2))(drop5))
up6 = BatchNormalization()(up6)
merge6 = concatenate([drop4, up6], axis=3)
conv6 = Conv2D(512, 3, activation='relu', padding='same', kernel_initializer='he_normal')(merge6)
conv6 = BatchNormalization()(conv6)
conv6 = Conv2D(512, 3, activation='relu', padding='same', kernel_initializer='he_normal')(conv6)
conv6 = BatchNormalization()(conv6)
up7 = Conv2D(256, 2, activation='relu', padding='same', kernel_initializer='he_normal')(
UpSampling2D(size=(2, 2))(conv6))
up7 = BatchNormalization()(up7)
merge7 = concatenate([conv3, up7], axis=3)
conv7 = Conv2D(256, 3, activation='relu', padding='same', kernel_initializer='he_normal')(merge7)
conv7 = BatchNormalization()(conv7)
conv7 = Conv2D(256, 3, activation='relu', padding='same', kernel_initializer='he_normal')(conv7)
conv7 = BatchNormalization()(conv7)
up8 = Conv2D(128, 2, activation='relu', padding='same', kernel_initializer='he_normal')(
UpSampling2D(size=(2, 2))(conv7))
up8 = BatchNormalization()(up8)
merge8 = concatenate([conv2, up8], axis=3)
conv8 = Conv2D(128, 3, activation='relu', padding='same', kernel_initializer='he_normal')(merge8)
conv8 = BatchNormalization()(conv8)
conv8 = Conv2D(128, 3, activation='relu', padding='same', kernel_initializer='he_normal')(conv8)
conv8 = BatchNormalization()(conv8)
up9 = Conv2D(64, 2, activation='relu', padding='same', kernel_initializer='he_normal')(
UpSampling2D(size=(2, 2))(conv8))
up9 = BatchNormalization()(up9)
merge9 = concatenate([conv1, up9], axis=3)
conv9 = Conv2D(64, 3, activation='relu', padding='same', kernel_initializer='he_normal')(merge9)
conv9 = BatchNormalization()(conv9)
conv9 = Conv2D(64, 3, activation='relu', padding='same', kernel_initializer='he_normal')(conv9)
conv9 = BatchNormalization()(conv9)
conv9 = Conv2D(num_class, 3, activation='relu', padding='same', kernel_initializer='he_normal')(conv9)
conv9 = BatchNormalization()(conv9)
conv10 = Conv2D(num_class, 1, activation='softmax')(conv9)
model = Model(inputs, conv10)
# Compile the model
model.compile(loss=dice_loss, optimizer=Adam(lr=1e-4), metrics=['accuracy', dice_coef, iou_coef])
if (pretrained_weights):
model.load_weights(pretrained_weights)
return model
更新
感谢有用的答案和更多的研究,我认为问题在于验证数据(46 张图像),因为 100 个时期的学习曲线和 0.0001 的学习率太嘈杂:
我试图打乱数据并降低学习率来遇到问题。因此,我以 0.00001 和 0.000001 的学习率重新运行模型,但学习率较小,而验证损失和准确性的噪声较小,验证 IOU 和骰子系数在所有时期都停留在 30%。
值得一提的是,我使用提前停止回调,将“准确度”作为监控参数,但“准确度”在所有时期都增加了,因此没有提前停止。如果我从验证中设置任何指标,模型会在大约 7 个时期停止,而没有有用的结果。
有什么建议吗?
更新
经过更多的研究和实验,我认为影响结果的主要问题之一是批量大小。我将批量大小增加到 9 级,并将输入大小从 512、512 --> 256、256 减小,同时我也将学习率降低到 0.001 级。结果,验证曲线波动在第一个时期受到限制。具体来说,我使用以下参数重新运行模型:
- 输入大小:256、256、3
- 批量:9
- 训练数据集大小:189
- 验证数据集大小:46
- 学习率:0.001
- epochs:64(提前停止)
最后的结果是:
损失:0.0813 - 准确度:0.9320 - dice_coef:0.9182 - iou_coef:0.8532 - val_loss:0.0719 - val_accuracy:0.9365 - val_dice_coef:0.9273 - val_iou_coef:0.8696
但是,当我根据最终实验的指标显示原始图像和预测图像时,再次没有像我预期的那样匹配。
模型仍然过拟合吗?为什么会这样?
更新
在@fswings 的推荐下,我重新拆分了数据,在训练和验证中保持了相同数量的类。我的最终数据集大小是 125 个用于训练的图像和 34 个用于验证的图像。其他参数与上面上次更新完全相同:
- 输入大小:256、256、3
- 批量:9
- 训练数据集大小:125
- 验证数据集大小:34
- 学习率:0.001
- epochs:44(提前停止)
最后一个 epoch 的指标是:损失:0.0969 - 准确度:0.9350 - dice_coef:0.9022 - iou_coef:0.8279 - val_loss:0.0818 - val_accuracy:0.9299 - val_dice_coef:0.9175 - val_iou_coef:0.8537
这看起来相当不错,但验证损失和准确度曲线仍然非常嘈杂:
虽然与上述上次更新相比,预测结果要差得多。
有什么建议吗?



