How is Spatial Dropout in 2D implemented?

Cross Validated, machine learning, deep learning, TensorFlow, dropout

2022-02-03 21:41:57

This is in reference to the paper Efficient Object Localization Using Convolutional Networks; from what I understand, the dropout there is implemented in 2D.

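After reading the Keras code for Spatial 2D Dropout, I see that it essentially builds a random binary mask of shape [batch_size, 1, 1, num_channels]. But what exactly does this spatial 2D dropout do to an input convolution block of shape [batch_size, height, width, num_channels]?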

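My current guess is that, for each pixel, if any of the pixel's layers/channels has a negative value, all channels of that pixel are set to zero. Is this correct?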

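However, if my guess is correct, how does a binary mask of shape [batch_size, height, width, num_channels], exactly the shape of the original input block, give the usual element-wise dropout (this is what TensorFlow's original dropout implementation does: it sets the shape of the binary mask to the shape of the input)? It would then mean that if any pixel in the conv block is negative, the entire conv block is set to 0. This is the confusing part that I don't quite understand.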

2 Answers

This response is a bit late, but I needed to work this out for myself and thought it might help.

Looking at the paper, it seems that in Spatial Dropout we randomly set entire feature maps (also known as channels) to 0, rather than individual "pixels".

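What they say makes sense: adjacent pixels are highly correlated, so regular dropout does not work well on images. If you hide pixels at random, the network can still get a good idea of what they were just by looking at the neighbouring pixels. Dropping entire feature maps may be better aligned with the original intention of dropout.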

Here is a function that implements it in TensorFlow, based on tf.nn.dropout. The only real change from tf.nn.dropout is that the shape of our dropout mask is BatchSize * 1 * 1 * NumFeatureMaps, rather than BatchSize * Width * Height * NumFeatureMaps.

import tensorflow as tf

def spatial_dropout(x, keep_prob, seed=1234):
    # x is a convnet activation with shape BxWxHxF where F is the
    # number of feature maps for that layer
    # keep_prob is the probability of keeping each feature map

    # shape of the mask before reshaping: [batch size, number of feature maps]
    mask_shape = [tf.shape(x)[0], tf.shape(x)[3]]

    # get some uniform noise between keep_prob and 1 + keep_prob
    # (tf.random.uniform replaces the older tf.random_uniform alias)
    random_tensor = keep_prob
    random_tensor += tf.random.uniform(mask_shape,
                                       seed=seed,
                                       dtype=x.dtype)

    # if we take the floor of this, we get a binary matrix where each
    # value is 0 with probability 1 - keep_prob and 1 otherwise
    binary_tensor = tf.floor(random_tensor)

    # Reshape to multiply our feature maps by this tensor correctly
    binary_tensor = tf.reshape(binary_tensor,
                               [-1, 1, 1, tf.shape(x)[3]])
    # Zero out feature maps where appropriate; scale up to compensate
    ret = (x / keep_prob) * binary_tensor
    return ret
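
To check the floor trick and the rescaling without a TensorFlow session, the same steps can be sketched in NumPy. This is only an illustration under a few assumptions: the name spatial_dropout_np and the use of np.random.default_rng are mine, not part of the answer, and the input is an all-ones array chosen so that the effect on the mean is easy to read.

```python
import numpy as np

def spatial_dropout_np(x, keep_prob, rng):
    """NumPy sketch of the function above; x has shape [B, W, H, F]."""
    batch_size, num_maps = x.shape[0], x.shape[3]
    # Uniform noise in [keep_prob, 1 + keep_prob): its floor is 1 with
    # probability keep_prob and 0 otherwise.
    noise = keep_prob + rng.random((batch_size, num_maps))
    mask = np.floor(noise).reshape(batch_size, 1, 1, num_maps)
    # Broadcasting stretches the two size-1 axes over width and height.
    return x / keep_prob * mask

rng = np.random.default_rng(1234)
x = np.ones((64, 5, 5, 32))
y = spatial_dropout_np(x, keep_prob=0.8, rng=rng)

# Every feature map is constant: either all 0 or all 1 / keep_prob.
print(np.array_equal(y.min(axis=(1, 2)), y.max(axis=(1, 2))))
# Dividing by keep_prob keeps the mean roughly equal to the mean of x.
print(y.mean())
```

The division by keep_prob is what keeps the expected activation unchanged between training and inference, so no extra scaling is needed when dropout is switched off.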

Hope that helps!

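"My current guess is that, for each pixel, if any of the pixel's layers/channels has a negative value, all channels of that pixel are set to zero. Is this correct?"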

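I'm not sure exactly what you mean here, but dropout happens regardless of any values other than those randomly drawn for the dropout mask. That is, dropout is not affected by pixel values, filter weights, or feature map values. If you use a mask of size [batch_size, 1, 1, num_channels], you get a binary mask of that size during dropout. Each entry of that binary mask is zero with probability rate (at least in the Keras implementation, where rate is the first argument of the Dropout layer). This mask is then multiplied with your feature maps, so every mask dimension of size 1 is broadcast to match the shape of your feature maps.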
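
The point that the mask ignores the activations can be checked with a small NumPy sketch. The helper draw_channel_mask is hypothetical, written only for this illustration; it is not the Keras code, and it leaves out the rescaling of the surviving channels to keep the comparison simple.

```python
import numpy as np

def draw_channel_mask(batch_size, num_channels, rate, seed):
    # Hypothetical helper: the mask depends only on the shape, the drop
    # rate, and the random draw. The activations are not an input at all.
    rng = np.random.default_rng(seed)
    keep = rng.random((batch_size, 1, 1, num_channels)) >= rate
    return keep.astype(np.float64)

positive = np.full((2, 3, 3, 4), 5.0)  # every activation is positive
negative = -positive                   # every activation is negative

mask = draw_channel_mask(batch_size=2, num_channels=4, rate=0.5, seed=0)
dropped_pos = positive * mask
dropped_neg = negative * mask

# The same channels are zeroed in both cases, whatever the sign of the input.
print(np.array_equal(dropped_pos == 0, dropped_neg == 0))
# The [2, 1, 1, 4] mask is broadcast up to the [2, 3, 3, 4] input.
print(mask.shape, dropped_pos.shape)
```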
Imagine a simpler case: say you have feature maps of size [height, num_channels] (let's ignore the batch size for now) and your feature map values are:

print(feature_maps)

[[2 1 4]
 [1 3 2]
 [5 2 6]
 [2 2 1]]

print(feature_maps.shape)

(4, 3)

Then imagine a binary dropout mask of size [1, num_channels], like this:

print(dropout_mask)

[[0 1 0]]

print(dropout_mask.shape)

(1, 3)

Now notice what happens when you multiply feature_maps and dropout_mask:

print(feature_maps * dropout_mask)

[[0 1 0]
 [0 3 0]
 [0 2 0]
 [0 2 0]]

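The values in dropout_mask were broadcast to match the height of each feature map, and then the element-wise multiplication was performed. As a result, entire feature maps were zeroed out, and that is exactly what spatial dropout does.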
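
The same broadcasting carries over to the full [batch_size, height, width, num_channels] case from the question. Below is a minimal NumPy sketch; the input values and the hand-picked mask are made up for illustration (a real mask would be drawn at random for each sample and channel).

```python
import numpy as np

# Two samples, 2x2 spatial grid, 3 channels:
# shape [batch_size, height, width, num_channels].
x = np.arange(1, 25, dtype=np.float64).reshape(2, 2, 2, 3)

# Hand-picked mask (not random, so the result is reproducible):
# sample 0 keeps only channel 1, sample 1 keeps channels 0 and 2.
mask = np.array([[0, 1, 0],
                 [1, 0, 1]], dtype=np.float64).reshape(2, 1, 1, 3)

out = x * mask  # the size-1 height and width axes are broadcast

print(out[0, :, :, 0])  # channel 0 of sample 0: all zeros
print(out[0, :, :, 1])  # channel 1 of sample 0: unchanged
print(out[1, :, :, 1])  # channel 1 of sample 1: all zeros
```

An element-wise dropout mask would instead have the full shape (2, 2, 2, 3), with each of the 24 entries drawn independently, so individual pixels rather than whole channels would be dropped.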
