Deformations arise in a variety of scenes, from medical images to animals in motion to cloth blowing in the wind. We introduce novel topological image processing algorithms designed specifically for deforming scenes and apply them to two fundamental image processing problems: image matching and occlusion detection.
Local photometric descriptors are a crucial low-level component of numerous computer vision algorithms. In practice, these descriptors are constructed to be invariant to a class of transformations. However, developing a descriptor that is simultaneously robust to noise and invariant under general deformation has proven difficult. We introduce the Topological-Attributed Relational Graph (T-ARG), a new local photometric descriptor constructed from homology that is provably invariant to locally bounded deformation. This robust topological descriptor is backed by a formal mathematical framework. We evaluate T-ARG on a set of benchmark images. Results indicate that T-ARG significantly outperforms traditional descriptors on noisy, deforming images.
Occlusions provide critical cues about the 3D structure of man-made and natural scenes. We introduce a mathematical framework and algorithm to detect and localize occlusions in image sequences of scenes that include deforming objects. Our occlusion detector works under far weaker assumptions than other detectors. We prove that occlusions in deforming scenes occur when certain well-defined local topological invariants are not preserved. Our framework employs these invariants to detect occlusions with a zero false positive rate under assumptions of bounded deformation and bounded color variation. The novelty and strength of this methodology lie in the fact that it does not rely on spatio-temporal derivatives or matching, which can be problematic in scenes that include deforming objects. Instead, it is based on a mathematical representation of the underlying cause of occlusions in a deforming 3D scene. We demonstrate the effectiveness of the occlusion detector on image sequences of natural scenes, including deforming cloth and hand motions.
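
The idea that a deformation preserves a topological invariant while an occlusion can break it may be illustrated with a toy sketch. The code below is not the detector described above. It assumes a deliberately simple invariant (the number of 4-connected foreground components in a small binary patch), and the function names and example frames are hypothetical, chosen only for illustration.

```python
from collections import deque


def count_components(grid, foreground="#"):
    """Count 4-connected foreground components in a grid given as a list of strings."""
    rows, cols = len(grid), len(grid[0])
    seen = set()
    count = 0
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] != foreground or (r, c) in seen:
                continue
            # New component found: flood-fill it with breadth-first search.
            count += 1
            seen.add((r, c))
            queue = deque([(r, c)])
            while queue:
                y, x = queue.popleft()
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and grid[ny][nx] == foreground
                            and (ny, nx) not in seen):
                        seen.add((ny, nx))
                        queue.append((ny, nx))
    return count


def topology_changed(patch_a, patch_b):
    """Flag a change in the (toy) invariant between two patches."""
    return count_components(patch_a) != count_components(patch_b)


# Hypothetical frames: a blob, the same blob after a mild deformation,
# and the blob with its middle hidden by an occluder.
frame = ["........",
         ".####...",
         ".####...",
         "........"]
deformed = ["........",
            "..###...",
            ".#####..",
            "..##...."]
occluded = ["........",
            ".#..#...",
            ".#..#...",
            "........"]

print(topology_changed(frame, deformed))
print(topology_changed(frame, occluded))
```

In this sketch the deformed blob keeps a single component, so no change is flagged, whereas the occluder splits the blob in two and the invariant changes. The invariants used in the framework above are local and come with formal guarantees under bounded deformation and color variation, which this toy example does not attempt to reproduce.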