Tracking the movement of a camera and reconstruction the observed scene at the same time — this is a classical flagship task of computer vision. While traditional methods can yield great results, they often fail in hard cases because they are not robust to small mistakes or intermittent failures. On the other hand, recent works on learned methods often overfit to specialized situations and fail when applied in more general settings. In this work, we combine the tracking and reconstruction principles from traditional methods with the flexibility and strong robustness of deep learning. The results compare favorably against both classic and learning-powered methods.
More info at this website.