To understand how the RPN is trained, we can dig into the code from Matterport's Mask R-CNN repository, a TensorFlow/Keras implementation with over 10,000 stars on GitHub.
Take a look at the build_rpn_targets function in mrcnn/model.py.
Using the generated anchors (which depend on your anchor scales, ratios, image size, etc.), the function first computes the IoU between the anchors and the ground-truth boxes:
# Compute overlaps [num_anchors, num_gt_boxes]
overlaps = utils.compute_overlaps(anchors, gt_boxes)
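For reference, here is a minimal NumPy sketch of such a pairwise IoU computation. This is a simplified stand-in (the function name compute_overlaps_sketch is mine), not the repository's exact utils.compute_overlaps, and it assumes boxes in the [y1, x1, y2, x2] format that Matterport uses:
import numpy as np

def compute_overlaps_sketch(anchors, gt_boxes):
    """Pairwise IoU between anchors [A, 4] and gt_boxes [G, 4] in [y1, x1, y2, x2] format."""
    # Intersection coordinates, broadcast to shape [A, G]
    y1 = np.maximum(anchors[:, None, 0], gt_boxes[None, :, 0])
    x1 = np.maximum(anchors[:, None, 1], gt_boxes[None, :, 1])
    y2 = np.minimum(anchors[:, None, 2], gt_boxes[None, :, 2])
    x2 = np.minimum(anchors[:, None, 3], gt_boxes[None, :, 3])
    intersection = np.maximum(y2 - y1, 0) * np.maximum(x2 - x1, 0)
    # Areas of anchors and GT boxes
    area_a = (anchors[:, 2] - anchors[:, 0]) * (anchors[:, 3] - anchors[:, 1])
    area_g = (gt_boxes[:, 2] - gt_boxes[:, 0]) * (gt_boxes[:, 3] - gt_boxes[:, 1])
    union = area_a[:, None] + area_g[None, :] - intersection
    return intersection / union  # shape [num_anchors, num_gt_boxes]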
From this overlaps matrix we know how much each anchor overlaps each ground-truth box. We then label anchors as positive or negative based on their IoU with the ground truth. Following the Mask R-CNN paper, anchors with IoU > 0.7 are positive, anchors with IoU < 0.3 are negative, and everything in between is neutral and not used during training:
# RPN Match: 1 = positive anchor, -1 = negative anchor, 0 = neutral
rpn_match = np.zeros([anchors.shape[0]], dtype=np.int32)
# 1. Set negative anchors first. They get overwritten below if a GT box is
# matched to them.
anchor_iou_argmax = np.argmax(overlaps, axis=1)
anchor_iou_max = overlaps[np.arange(overlaps.shape[0]), anchor_iou_argmax]
rpn_match[anchor_iou_max < 0.3] = -1
# 2. Set an anchor for each GT box (regardless of IoU value).
# If multiple anchors have the same IoU match all of them
gt_iou_argmax = np.argwhere(overlaps == np.max(overlaps, axis=0))[:,0]
rpn_match[gt_iou_argmax] = 1
# 3. Set anchors with high overlap as positive.
rpn_match[anchor_iou_max >= 0.7] = 1
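To see what these three steps do, here is a small made-up example with 4 anchors and 2 ground-truth boxes (the IoU values are invented purely for illustration):
import numpy as np

# Hypothetical IoU matrix: 4 anchors x 2 GT boxes
overlaps = np.array([
    [0.10, 0.05],   # anchor 0: low IoU with both GTs -> negative (step 1)
    [0.55, 0.20],   # anchor 1: best anchor for GT 0 -> forced positive (step 2) despite IoU < 0.7
    [0.40, 0.75],   # anchor 2: IoU >= 0.7 with GT 1 -> positive (step 3)
    [0.45, 0.50],   # anchor 3: between 0.3 and 0.7, not a best match -> neutral
])
rpn_match = np.zeros(overlaps.shape[0], dtype=np.int32)

anchor_iou_argmax = np.argmax(overlaps, axis=1)
anchor_iou_max = overlaps[np.arange(overlaps.shape[0]), anchor_iou_argmax]
rpn_match[anchor_iou_max < 0.3] = -1
gt_iou_argmax = np.argwhere(overlaps == np.max(overlaps, axis=0))[:, 0]
rpn_match[gt_iou_argmax] = 1
rpn_match[anchor_iou_max >= 0.7] = 1

print(rpn_match)  # [-1  1  1  0]
Notice how step 2 guarantees every GT box gets at least one positive anchor even when no anchor reaches the 0.7 threshold.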
To train the RPN effectively, you need to set RPN_TRAIN_ANCHORS_PER_IMAGE carefully so that training stays balanced when an image contains only a few objects. Note that several anchors can be matched to the same ground-truth box, because each positive anchor gets its own bbox offset (refinement) that shifts it onto the ground truth.
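After the matching above, build_rpn_targets subsamples the anchors so that at most RPN_TRAIN_ANCHORS_PER_IMAGE of them (roughly half positive, half negative) contribute to the loss. The following is a sketch of that balancing logic, not the repository's exact code; it continues from the rpn_match array above and assumes the default value of 256 for RPN_TRAIN_ANCHORS_PER_IMAGE:
import numpy as np

RPN_TRAIN_ANCHORS_PER_IMAGE = 256  # Matterport's default config value

# Keep at most half of the training anchors as positives;
# extra positives are reset to neutral (0) so the loss ignores them.
ids = np.where(rpn_match == 1)[0]
extra = len(ids) - (RPN_TRAIN_ANCHORS_PER_IMAGE // 2)
if extra > 0:
    rpn_match[np.random.choice(ids, extra, replace=False)] = 0

# Fill the rest of the budget with negatives; extra negatives become neutral too.
ids = np.where(rpn_match == -1)[0]
extra = len(ids) - (RPN_TRAIN_ANCHORS_PER_IMAGE - np.sum(rpn_match == 1))
if extra > 0:
    rpn_match[np.random.choice(ids, extra, replace=False)] = 0
This keeps the classification loss from being dominated by the huge number of negative anchors when an image has few objects.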
Hope the answer is clear to you!