I've implemented a variant of Faster R-CNN and was wondering about the canonical way to stabilize the training of the (final) bounding box regression. At the moment I have the following issue: the regression loss is 0 until the RPN is optimized enough to produce RoIs with IoU > 0.5 against the ground truth boxes. Then the regression loss spikes (the regression head has not been optimized up to this point because of the 0 loss) and the large gradient "destroys" the rest of the model. As a result, the IoU of the RoIs with the ground truth drops below 0.5 again, the regression loss goes back to 0, and training keeps looping between these two scenarios.
What is the "canonical"/usual way to make this stable? I am currently overfitting on a single image.
Best
After some investigation and trial and error, I found that I had to detach the RoI proposal coordinates before feeding them into the head, so that the head losses are not backpropagated into the RPN layers through the proposal coordinates. Additionally, it helps to inject the ground truth boxes as RoI proposals before RoI pooling to stabilize the training.
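For anyone running into the same problem, here is a minimal sketch of where the two changes go in the forward pass. It assumes hypothetical helpers `rpn`, `roi_align`, and `box_head` standing in for the corresponding parts of my model; the names are illustrative, not a real API:

```python
import torch

def forward_rois(features, images, gt_boxes):
    # 1) Get proposals from the RPN (hypothetical helper).
    proposals, rpn_losses = rpn(features, images, gt_boxes)

    # 2) Detach the proposal coordinates so the box-head losses do not
    #    backpropagate into the RPN through the proposal coordinates.
    proposals = [p.detach() for p in proposals]

    # 3) During training, append the ground truth boxes to the proposals.
    #    This guarantees some RoIs with high IoU from the first step on,
    #    so the regression head always gets a training signal.
    if gt_boxes is not None:
        proposals = [torch.cat([p, gt], dim=0)
                     for p, gt in zip(proposals, gt_boxes)]

    # 4) Pool features for the RoIs and run the box head as usual.
    roi_features = roi_align(features, proposals)
    cls_logits, box_deltas = box_head(roi_features)
    return cls_logits, box_deltas, rpn_losses
```

If I remember correctly, torchvision's reference Faster R-CNN does both of these as well: the RPN outputs are detached before decoding proposals, and `RoIHeads` adds the ground truth boxes to the proposals during training.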