What I know is that stereo matching recovers depth from a disparity map using the formula depth = focal_length * baseline / disparity. I also know that unsupervised monocular depth estimation can only recover relative depth (depth up to an unknown scale), which sounds reasonable.
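For reference, this is how I understand the formula in code. It is only a minimal sketch: the function name, the parameter names, and the units (focal length in pixels, baseline in meters) are my own assumptions, and pixels with zero disparity are treated as infinitely far away.

```python
import numpy as np

def depth_from_disparity(disparity, focal_length_px, baseline_m):
    """depth = focal_length * baseline / disparity, applied per pixel.

    Hypothetical helper: names and units are assumptions for illustration.
    Pixels with non-positive disparity get depth = inf.
    """
    disparity = np.atleast_1d(np.asarray(disparity, dtype=np.float64))
    depth = np.full_like(disparity, np.inf)
    valid = disparity > 0
    depth[valid] = focal_length_px * baseline_m / disparity[valid]
    return depth

# Example values (made up): f = 500 px, baseline = 0.1 m
example_depth = depth_from_disparity([10.0, 20.0, 0.0], 500.0, 0.1)
```

Note that with a known focal length and baseline, doubling the disparity halves the depth, so a scale ambiguity in disparity would translate directly into a scale ambiguity in depth.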
But if unsupervised stereo matching can likewise only recover relative depth, does that mean it can only recover relative disparity from the two images? In other words, could the correspondence (x, y) -> (x + d, y) equally well be estimated as (x, y) -> (x + n*d, y) for some scale factor n?
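To make the question concrete, here is a toy 1-D check I put together. It assumes rectified images and a photometric reconstruction error as the training signal, which is my understanding of how unsupervised methods are usually trained; the random texture, the disparity of 4 pixels, and all names are made up for illustration.

```python
import numpy as np

def warp_row(source_row, disparity):
    """Sample source_row at x + disparity using linear interpolation."""
    x = np.arange(source_row.size, dtype=np.float64)
    return np.interp(x + disparity, x, source_row)

rng = np.random.default_rng(0)
source = rng.random(200)            # one row of a random texture
true_d = 4.0                        # assumed ground-truth disparity
target = warp_row(source, true_d)   # target[x] = source[x + d]

def photometric_error(n):
    """Mean absolute error when the target is reconstructed with n * d."""
    recon = warp_row(source, n * true_d)
    valid = slice(0, source.size - 20)  # skip the border where sampling clamps
    return float(np.mean(np.abs(recon[valid] - target[valid])))

err_true = photometric_error(1.0)    # reconstruct with d
err_scaled = photometric_error(2.0)  # reconstruct with 2 * d
print(err_true, err_scaled)
```

In this toy setup the reconstruction with d matches the target exactly, while the one with 2*d does not, so at least here the scaled correspondence is not an equally good solution. I am not sure whether this carries over to real networks and real images, which is why I am asking.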