My problem is: I can now get dense optical flow for the video. The object and the background are both moving because the camera rotates, so both the target and the background produce flow. I'm looking for a way to eliminate the background.
1. The camera I'm using rotates at a fixed speed while recording. Is there a way to relate the camera's velocity to the optical flow of the background, so that I can tell background flow from foreground flow based on the camera parameters? Is depth information necessary?
2. The camera (an event camera) outputs binary images, which makes it harder to tell the target from the background when they mix.
I tried putting a threshold on the optical flow for all pixels, i.e. masking off all pixels with relatively low flow magnitude. The problem is that when the target (a small drone) moves slowly this doesn't work, and the method can't eliminate all of the background.
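For reference, the thresholding step described above is roughly the following (a minimal sketch; it assumes you already have an HxWx2 dense flow field, e.g. from cv2.calcOpticalFlowFarneback, and the threshold value is a hand-picked placeholder):

```python
import cv2
import numpy as np

def mask_low_flow(flow, thresh=2.0):
    """Keep only pixels whose flow magnitude exceeds `thresh` (pixels/frame)."""
    mag, _ang = cv2.cartToPolar(flow[..., 0], flow[..., 1])
    # Pixels moving slower than the threshold are treated as background.
    # This is exactly what breaks down when the drone itself moves slowly.
    return (mag > thresh).astype(np.uint8) * 255
```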
Background subtraction is a category of low-level algorithms, so don't expect too much out of it. You'll have to deploy some computationally more expensive operations: basically, you'll have to stabilize the footage as if it came from an action cam.
Assuming no translation between camera and scene, distances become irrelevant, so you don't need depth information. That leaves just rotation to compensate for, which is much less of a headache.
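To see why (a sketch under the usual pinhole-camera assumptions, with K the intrinsic matrix, R the rotation between the two views, and ≃ meaning equality up to scale):

```latex
% A scene point X at depth Z projects to pixel x \simeq K X.
% With translation t, the second view sees x' \simeq K (R X + t), which depends on Z.
% With t = 0 the depth cancels and the two views are related by a homography:
x' \simeq K R K^{-1} x
```

No depth appears in the rotation-only mapping, which is why you can get away without it here.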
You'll have to track the camera rotation/pose. That topic is called Structure from Motion or SLAM, and optical flow can be a component of it. It usually involves feature tracking, which can be done via matching or via sparse optical flow on the feature points.
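A minimal sketch of the sparse-flow variant (assuming OpenCV: corner detection with goodFeaturesToTrack plus pyramidal Lucas-Kanade; parameter values are illustrative):

```python
import cv2
import numpy as np

def track_features(prev_gray, gray, max_corners=500):
    """Detect corners in the previous frame and follow them into the current one."""
    p0 = cv2.goodFeaturesToTrack(prev_gray, maxCorners=max_corners,
                                 qualityLevel=0.01, minDistance=7)
    p1, status, _err = cv2.calcOpticalFlowPyrLK(prev_gray, gray, p0, None,
                                                winSize=(21, 21), maxLevel=3)
    ok = status.ravel() == 1
    return p0[ok].reshape(-1, 2), p1[ok].reshape(-1, 2)
```

From the surviving correspondences you can then estimate the inter-frame motion, e.g. with cv2.findHomography(..., method=cv2.RANSAC); under pure rotation a homography is the correct model.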
A good approach would probably combine both (sensor fusion). Optical flow is fairly low-noise, but when you accumulate it, it drifts. Feature matching is noisier, but it doesn't drift.
Once you know how your camera moves, you can warp each image into the initial (or any other) camera pose. By "camera pose" I really just mean a particular orientation of your projection. The math for this is commonly discussed under "image stitching".
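A minimal sketch of that warp, assuming you have the intrinsic matrix K and the accumulated rotation R from the reference pose to the current frame's pose (both are inputs you'd have to supply from your calibration and tracking; the names are illustrative):

```python
import cv2
import numpy as np

def warp_to_reference(frame, K, R):
    """Warp `frame` into the reference camera pose.

    K : 3x3 intrinsic matrix.
    R : 3x3 rotation taking the reference pose to the current pose.
    """
    # Pure-rotation homography mapping current-frame pixels back to the
    # reference frame: x_ref ~ K R^-1 K^-1 x_cur (no depth involved).
    H = K @ R.T @ np.linalg.inv(K)  # R.T is the inverse of a rotation
    h, w = frame.shape[:2]
    return cv2.warpPerspective(frame, H, (w, h))
```

Once every frame is warped into the same pose, the background is (ideally) static, so a plain frame difference, or the magnitude threshold you already tried, has a much easier time isolating the drone.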