I implemented a U-Net in TensorFlow to segment MRI images of the thigh. I noticed that I always end up with a slightly higher validation accuracy than training accuracy, regardless of the initial split. Here is an example.
So I looked into when this can happen:
- When the validation set is "easy". I trained with several different initial splits, and all of them showed the same higher validation accuracy, so an easy split alone does not explain it.
- Regularization and augmentation can lower training accuracy. I removed the augmentation and the dropout regularization and still observed the same gap; the only difference was that it took more epochs to reach convergence.
- The last thing I found is that in Keras the training accuracy and loss are averaged over all the iterations of an epoch, whereas the validation accuracy and loss are computed from the model as it stands at the end of the epoch. This can make the training loss look higher and the training accuracy look lower (see the sketch after this list).
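To make that last comparison concrete, here is a minimal sketch (my own, not part of the training script) of a Keras callback that re-evaluates the training data with the frozen end-of-epoch weights, so the training metrics are computed the same way as the `val_*` metrics; `train_data` stands for whatever generator or `tf.data.Dataset` is being trained on:

```python
import tensorflow as tf

class EndOfEpochTrainEval(tf.keras.callbacks.Callback):
    """Evaluate the training data once at the end of each epoch, with the
    weights frozen, the same way Keras computes the val_* metrics."""

    def __init__(self, train_data):
        super().__init__()
        self.train_data = train_data

    def on_epoch_end(self, epoch, logs=None):
        results = self.model.evaluate(self.train_data, verbose=0)
        for name, value in zip(self.model.metrics_names, results):
            print(f'epoch {epoch + 1}: end-of-epoch train {name} = {value:.4f}')
```

If the averaging effect were the whole story, these end-of-epoch training numbers should match the validation numbers when training and validating on the same data.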
So I reasoned that if I train and validate on the same set, I should get the same curves, just shifted by one epoch. I therefore trained on only 2 batches and validated on the same 2 batches (with no dropout or augmentation). I still think something else is going on, because the curves do not look alike: at the very least, once the weights stop changing at the end of training, the training and validation accuracy should be identical, yet the validation accuracy is still higher by a small gap. Here is the plot.
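For reference, the sanity check described above looks roughly like this (a sketch: `train_dataset` is an assumed `tf.data.Dataset` of (image, mask) batches, and `unet_no_dropout` is the model defined below):

```python
# Train and validate on the same two batches (no dropout, no augmentation).
# If the training curve were only a one-epoch-delayed validation curve, the
# two curves should coincide once the weights stop changing.
two_batches = train_dataset.take(2).cache()  # assumed tf.data.Dataset

model = unet_no_dropout()
history = model.fit(two_batches, validation_data=two_batches, epochs=80)
```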
Is there anything else that could inflate the loss value? This is the model I am using:
```python
import tensorflow as tf
from tensorflow.keras.optimizers import Adam


def unet_no_dropout(pretrained_weights=None, input_size=(512, 512, 1)):
    inputs = tf.keras.layers.Input(input_size)

    # Encoder
    conv1 = tf.keras.layers.Conv2D(64, 3, activation='relu', padding='same', kernel_initializer='he_normal')(inputs)
    conv1 = tf.keras.layers.Conv2D(64, 3, activation='relu', padding='same', kernel_initializer='he_normal')(conv1)
    pool1 = tf.keras.layers.MaxPooling2D(pool_size=(2, 2))(conv1)
    conv2 = tf.keras.layers.Conv2D(128, 3, activation='relu', padding='same', kernel_initializer='he_normal')(pool1)
    conv2 = tf.keras.layers.Conv2D(128, 3, activation='relu', padding='same', kernel_initializer='he_normal')(conv2)
    pool2 = tf.keras.layers.MaxPooling2D(pool_size=(2, 2))(conv2)
    conv3 = tf.keras.layers.Conv2D(256, 3, activation='relu', padding='same', kernel_initializer='he_normal')(pool2)
    conv3 = tf.keras.layers.Conv2D(256, 3, activation='relu', padding='same', kernel_initializer='he_normal')(conv3)
    pool3 = tf.keras.layers.MaxPooling2D(pool_size=(2, 2))(conv3)
    conv4 = tf.keras.layers.Conv2D(512, 3, activation='relu', padding='same', kernel_initializer='he_normal')(pool3)
    conv4 = tf.keras.layers.Conv2D(512, 3, activation='relu', padding='same', kernel_initializer='he_normal')(conv4)
    # drop4 = tf.keras.layers.Dropout(0.5)(conv4)
    pool4 = tf.keras.layers.MaxPooling2D(pool_size=(2, 2))(conv4)

    # Bottleneck
    conv5 = tf.keras.layers.Conv2D(1024, 3, activation='relu', padding='same', kernel_initializer='he_normal')(pool4)
    conv5 = tf.keras.layers.Conv2D(1024, 3, activation='relu', padding='same', kernel_initializer='he_normal')(conv5)
    # drop5 = tf.keras.layers.Dropout(0.5)(conv5)

    # Decoder with skip connections
    up6 = tf.keras.layers.Conv2D(512, 2, activation='relu', padding='same', kernel_initializer='he_normal')(
        tf.keras.layers.UpSampling2D(size=(2, 2))(conv5))
    merge6 = tf.keras.layers.concatenate([conv4, up6], axis=3)
    conv6 = tf.keras.layers.Conv2D(512, 3, activation='relu', padding='same', kernel_initializer='he_normal')(merge6)
    conv6 = tf.keras.layers.Conv2D(512, 3, activation='relu', padding='same', kernel_initializer='he_normal')(conv6)
    up7 = tf.keras.layers.Conv2D(256, 2, activation='relu', padding='same', kernel_initializer='he_normal')(
        tf.keras.layers.UpSampling2D(size=(2, 2))(conv6))
    merge7 = tf.keras.layers.concatenate([conv3, up7], axis=3)
    conv7 = tf.keras.layers.Conv2D(256, 3, activation='relu', padding='same', kernel_initializer='he_normal')(merge7)
    conv7 = tf.keras.layers.Conv2D(256, 3, activation='relu', padding='same', kernel_initializer='he_normal')(conv7)
    up8 = tf.keras.layers.Conv2D(128, 2, activation='relu', padding='same', kernel_initializer='he_normal')(
        tf.keras.layers.UpSampling2D(size=(2, 2))(conv7))
    merge8 = tf.keras.layers.concatenate([conv2, up8], axis=3)
    conv8 = tf.keras.layers.Conv2D(128, 3, activation='relu', padding='same', kernel_initializer='he_normal')(merge8)
    conv8 = tf.keras.layers.Conv2D(128, 3, activation='relu', padding='same', kernel_initializer='he_normal')(conv8)
    up9 = tf.keras.layers.Conv2D(64, 2, activation='relu', padding='same', kernel_initializer='he_normal')(
        tf.keras.layers.UpSampling2D(size=(2, 2))(conv8))
    merge9 = tf.keras.layers.concatenate([conv1, up9], axis=3)
    conv9 = tf.keras.layers.Conv2D(64, 3, activation='relu', padding='same', kernel_initializer='he_normal')(merge9)
    conv9 = tf.keras.layers.Conv2D(64, 3, activation='relu', padding='same', kernel_initializer='he_normal')(conv9)
    conv9 = tf.keras.layers.Conv2D(2, 3, activation='relu', padding='same', kernel_initializer='he_normal')(conv9)

    # Single-channel sigmoid output for binary segmentation
    conv10 = tf.keras.layers.Conv2D(1, 1, activation='sigmoid')(conv9)

    model = tf.keras.Model(inputs=inputs, outputs=conv10)
    model.compile(optimizer=Adam(lr=2e-4), loss='binary_crossentropy', metrics=[tf.keras.metrics.Accuracy()])
    # model.compile(optimizer=tf.keras.optimizers.Adam(2e-4), loss=combo_loss(alpha=0.2, beta=0.4), metrics=[dice_accuracy])
    # model.compile(optimizer=RMSprop(lr=0.00001), loss=combo_loss, metrics=[dice_accuracy])

    if pretrained_weights:
        model.load_weights(pretrained_weights)
    return model
```
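As an aside on the compiled metric: `tf.keras.metrics.Accuracy()` counts a prediction as correct only when `y_pred` exactly equals `y_true`, while `tf.keras.metrics.BinaryAccuracy()` (or the string `'accuracy'`, which Keras resolves against the loss) thresholds the sigmoid output at 0.5 first, so the two can report very different numbers on the same predictions:

```python
import tensorflow as tf

y_true = tf.constant([[1.0], [0.0], [1.0]])
y_pred = tf.constant([[0.9], [0.2], [0.6]])  # raw sigmoid outputs

acc = tf.keras.metrics.Accuracy()            # exact-match comparison
acc.update_state(y_true, y_pred)
print(acc.result().numpy())                  # 0.0 -- nothing matches exactly

bin_acc = tf.keras.metrics.BinaryAccuracy()  # thresholds at 0.5 first
bin_acc.update_state(y_true, y_pred)
print(bin_acc.result().numpy())              # 1.0 -- all correct after thresholding
```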
And this is how I save the model and run the training:
```python
model_checkpoint = tf.keras.callbacks.ModelCheckpoint('unet_ThighOuterSurfaceval.hdf5',
                                                      monitor='val_loss', verbose=1, save_best_only=True)
model_checkpoint2 = tf.keras.callbacks.ModelCheckpoint('unet_ThighOuterSurface.hdf5',
                                                       monitor='loss', verbose=1, save_best_only=True)

model = unet_no_dropout()
history = model.fit(genaug,
                    validation_data=genval,
                    validation_steps=len(genval),
                    steps_per_epoch=len(genaug),
                    epochs=80,
                    callbacks=[model_checkpoint, model_checkpoint2])
```
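To visualise the averaging effect on the curves, a small plotting sketch (assuming the metric shows up in `history.history` under the default name `accuracy`): since the running average over an epoch is centred roughly mid-epoch, the training curve is shifted half an epoch to the left before comparing it with the end-of-epoch validation curve:

```python
import matplotlib.pyplot as plt

train_acc = history.history['accuracy']
val_acc = history.history['val_accuracy']
epochs = range(1, len(train_acc) + 1)

# Shift the running-average training curve half an epoch to the left so it
# lines up with the end-of-epoch validation measurements.
plt.plot([e - 0.5 for e in epochs], train_acc, label='train (shifted 0.5 epoch)')
plt.plot(epochs, val_acc, label='validation')
plt.xlabel('epoch')
plt.ylabel('accuracy')
plt.legend()
plt.show()
```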