I do not quite understand the RoIAlignRotated operation (from mmcv.ops, https://mmcv.readthedocs.io/en/v1.3.4/_modules/mmcv/ops/roi_align_rotated.html?highlight=roialignrotated); it does not do what I expect it to do.
I expect it to work as described in Oriented R-CNN (https://arxiv.org/abs/2108.05699). To test it, I've written the following script:
import cv2
import numpy as np
import torch
import math
from mmcv.ops import RoIAlignRotated
box = [[98.0, 97.0], [82.0, 103.0], [79.0, 98.0], [96.0, 91.0]]
box = cv2.minAreaRect(np.array(box).astype(np.int32))  # minAreaRect expects int32/float32 points
box = torch.tensor([box[0][0], box[0][1], box[1][0], box[1][1], box[2]]).unsqueeze(0).to("cuda")  # (cx, cy, w, h, angle)
box = box.to(torch.int64).to(torch.float)  # truncate the box parameters to whole pixels
box[..., -1] = math.pi / 2  # set the angle to pi/2 rad for test purposes (OpenCV angle definition: https://github.com/open-mmlab/mmrotate/blob/main/docs/en/intro.md)
box = torch.cat((torch.zeros((1, 1)).to("cuda"), box), dim=-1)  # prepend the batch index
r_ = RoIAlignRotated(
    output_size=(int(box[0, 4].item()), int(box[0, 3].item())),  # (h, w) taken from the box itself
    spatial_scale=1,
    sampling_ratio=0,
    clockwise=True,
)
image = cv2.imread("image.tif") # type: ignore
tens = torch.tensor(image).permute((2, 0, 1)).unsqueeze(0).to("cuda").float()  # HxWxC -> 1xCxHxW
z = r_(tens, box)
z = z.detach().clone().cpu()[0].permute((1, 2, 0)).to(torch.int64).numpy()  # 1xCxHxW -> HxWxC
cv2.imwrite("test_roi_align.png", z)  # type: ignore
I'm expecting it to extract and rotate the box from the image. As far as I understand, after the rotation it does an average pooling, which I prevent by setting the output size of the RoIAlignRotated equal to the box size. However, 'z' is not the expected image patch, but just some random colors (see the image at the end; for comparison, I've also included below a plain-OpenCV sketch of the crop I'm expecting). Does somebody have an idea of what I'm misunderstanding or doing wrong? Thank you!
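For reference, this is roughly how I would extract the same rotated patch with plain OpenCV, just to have something to compare the RoIAlignRotated output against (only a sketch of my expectation; crop_rotated_rect and the degree-based angle handling are my own and not part of mmcv, and the minAreaRect angle convention may differ between OpenCV versions):
import cv2
import numpy as np

def crop_rotated_rect(image, rect):
    # rect is an OpenCV minAreaRect tuple: ((cx, cy), (w, h), angle in degrees)
    (cx, cy), (w, h), angle = rect
    # rotate the whole image around the box center so the rectangle becomes axis-aligned ...
    rot = cv2.getRotationMatrix2D((cx, cy), angle, 1.0)
    rotated = cv2.warpAffine(image, rot, (image.shape[1], image.shape[0]))
    # ... then take an axis-aligned crop of size (w, h) around the center
    x0 = int(round(cx - w / 2))
    y0 = int(round(cy - h / 2))
    return rotated[y0:y0 + int(round(h)), x0:x0 + int(round(w))]

image = cv2.imread("image.tif")
rect = cv2.minAreaRect(np.array([[98, 97], [82, 103], [79, 98], [96, 91]], dtype=np.int32))
cv2.imwrite("reference_crop.png", crop_rotated_rect(image, rect))
This is what I mean by "extract and rotate the box": a crop of the image with the same size as the box, with the box rotated so that it is axis-aligned.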