I'm trying to learn about stereo calibration using synthetic data rendered in Blender. My scene setup looks as follows:
This allows me to extract pixel-perfect keypoint correspondences. I'm simulating a gradual (but extreme) decalibration of the cameras, i.e. the cameras rotate out of an initially perfect fronto-parallel configuration (see video).
However, when I try to rectify my images with these perfect keypoint correspondences using OpenCV, I get rectified images that individually look straight, i.e. the horizon line appears horizontal, but the epipolar lines between the images are not horizontal, i.e. corresponding projections of the same scene point do not lie on the same image row.
I am assuming known camera intrinsics but unknown extrinsics, i.e. the extrinsics are derived from the (pixel-perfect) keypoint correspondences.
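Aside: since the intrinsics are known, I understand the essential matrix could also be estimated directly rather than via F. A minimal sketch, using the same points_left, points_right, and K_left as in the code below (note that this overload of cv2.findEssentialMat assumes a single camera matrix for both views, which is only an approximation if K_left and K_right differ):

# Sketch of an alternative to Steps 1-3 below: estimate E directly from the
# known intrinsics, approximating both cameras with the single matrix K_left.
E_direct, inlier_mask = cv2.findEssentialMat(
    points_left, points_right, K_left, method=cv2.RANSAC, prob=0.999, threshold=1.0
)
_, R_direct, t_direct, _ = cv2.recoverPose(E_direct, points_left, points_right, K_left)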
My code looks as follows:
import cv2
import numpy as np

DATA_DIR = "../data/street_sunset_motionblur"
FRAME = 149
# Load images
left_rgb_path = f"{DATA_DIR}/rgb/left/street_sunset_motionblur_{FRAME:04d}_L.tif"
right_rgb_path = f"{DATA_DIR}/rgb/right/street_sunset_motionblur_{FRAME:04d}_R.tif"
img_left = cv2.imread(left_rgb_path)
img_right = cv2.imread(right_rgb_path)
# Convert from BGR to RGB
img_left = cv2.cvtColor(img_left, cv2.COLOR_BGR2RGB)
img_right = cv2.cvtColor(img_right, cv2.COLOR_BGR2RGB)
# Load keypoint correspondences.
# The correspondences array has shape (N, 5), where each row is one keypoint
# in the format [x_left, y_left, x_right, y_right, keypoint_id].
keypoints_path = f"{DATA_DIR}/keypoints/correspondences_{FRAME}.npy"
correspondences = np.load(keypoints_path)
# Split into left points [x_left, y_left] and right points [x_right, y_right]
points_left = correspondences[:, :2]
points_right = correspondences[:, 2:4]
# Load intrinsics matrices
K_left = np.load(f"{DATA_DIR}/camera_matrices/cam_intrinsic_matrix_{FRAME}_L.npy")
K_right = np.load(f"{DATA_DIR}/camera_matrices/cam_intrinsic_matrix_{FRAME}_R.npy")
# Step 1: Compute the Fundamental Matrix
F, mask = cv2.findFundamentalMat(points_left, points_right, cv2.FM_LMEDS)
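# Sanity check: with pixel-perfect matches, the algebraic epipolar residual
# x_right^T F x_left should be ~0 for every correspondence.
ones = np.ones((len(points_left), 1))
xl_h = np.hstack([points_left, ones])
xr_h = np.hstack([points_right, ones])
f_residuals = np.abs(np.einsum("ni,ij,nj->n", xr_h, F, xl_h))
print("F residuals: mean", f_residuals.mean(), "max", f_residuals.max())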
# Step 2: Compute the Essential Matrix
E = K_right.T @ F @ K_left
# Step 3: Decompose the Essential Matrix to obtain the rotation and translation
_, R, t, _ = cv2.recoverPose(E, points_left, points_right, K_left)
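# Note: this recoverPose overload assumes both views share one camera matrix
# (K_left is used for both), which is only an approximation if K_right differs.
# Sanity check: the recovered pose should satisfy the epipolar constraint
# x_rn^T [t]_x R x_ln ~= 0 on normalized image coordinates.
pts_ln = cv2.undistortPoints(points_left.reshape(-1, 1, 2).astype(np.float64), K_left, None).reshape(-1, 2)
pts_rn = cv2.undistortPoints(points_right.reshape(-1, 1, 2).astype(np.float64), K_right, None).reshape(-1, 2)
t_x = np.array([
    [0.0, -t[2, 0], t[1, 0]],
    [t[2, 0], 0.0, -t[0, 0]],
    [-t[1, 0], t[0, 0], 0.0],
])
xl_n = np.hstack([pts_ln, np.ones((len(pts_ln), 1))])
xr_n = np.hstack([pts_rn, np.ones((len(pts_rn), 1))])
pose_residuals = np.abs(np.einsum("ni,ij,nj->n", xr_n, t_x @ R, xl_n))
print("pose residuals: mean", pose_residuals.mean(), "max", pose_residuals.max())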
# Specify image size in (width, height) format
image_size = (img_left.shape[1], img_left.shape[0])
# Step 4: Compute the rectification transforms
R1, R2, P1, P2, _, _, _ = cv2.stereoRectify(
cameraMatrix1=K_left,
distCoeffs1=None,
cameraMatrix2=K_right,
distCoeffs2=None,
imageSize=image_size,
R=R,
T=t,
flags=cv2.CALIB_ZERO_DISPARITY,
alpha=-1,
)
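# With CALIB_ZERO_DISPARITY, P1 and P2 should share the same focal lengths
# and principal point and differ only in the horizontal translation term
# P2[0, 3]; that shared vertical geometry is what should make the epipolar
# lines horizontal and row-aligned.
print("P1:\n", P1)
print("P2:\n", P2)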
# Step 5: Compute the rectification maps
map1_x, map1_y = cv2.initUndistortRectifyMap(K_left, None, R1, P1, image_size, cv2.CV_32FC1)
map2_x, map2_y = cv2.initUndistortRectifyMap(K_right, None, R2, P2, image_size, cv2.CV_32FC1)
# Step 6: Apply the rectification maps
rectified_left = cv2.remap(img_left, map1_x, map1_y, cv2.INTER_LINEAR)
rectified_right = cv2.remap(img_right, map2_x, map2_y, cv2.INTER_LINEAR)
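To quantify the misalignment, here is a minimal diagnostic sketch (reusing the variables above): it maps the original keypoints through the same rectification that initUndistortRectifyMap applies to the pixels, then compares the resulting rows.

# Diagnostic sketch: rectify the keypoints themselves. undistortPoints with
# R and P applies the same rotation/reprojection that the rectification maps
# encode, so corresponding points should end up on the same pixel row.
rect_pts_left = cv2.undistortPoints(
    points_left.reshape(-1, 1, 2).astype(np.float64), K_left, None, R=R1, P=P1
).reshape(-1, 2)
rect_pts_right = cv2.undistortPoints(
    points_right.reshape(-1, 1, 2).astype(np.float64), K_right, None, R=R2, P=P2
).reshape(-1, 2)
row_error = np.abs(rect_pts_left[:, 1] - rect_pts_right[:, 1])
print(f"row misalignment: mean {row_error.mean():.2f}px, max {row_error.max():.2f}px")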
Any idea what I'm doing wrong, or what I'm missing? Given perfect keypoint matches and intrinsics, shouldn't I be able to get rectified images where a scene point shows up on the same pixel row in both images?


