CHAPTER 5
Formalizing Non-Rigid Structure from Motion
The template-based methods discussed in Chapter 4 are effective at resolving the ambiguities
inherent to deformable surface 3D reconstruction from a single input image, provided that another image
in which the shape is known can be used as a reference. However, in practice, such a reference may
not always be available and there is a need for methods that can operate without one.
One important approach to overcoming this limitation is to take advantage of the fact
that tracking points over sequences can also be used to resolve ambiguities, without the need
for a reference shape. This has long been known in the context of rigid shape recovery and
is exploited by Structure-from-Motion (SFM) algorithms, usually using a variant of the
factorization method of Tomasi and Kanade [1992]. Although initially studied in Ullman [1983],
Non-Rigid Structure-from-Motion (NRSFM) as formulated by most recent methods was introduced
in Bregler et al. [2000] and has been vigorously pursued since then.
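For readers unfamiliar with the rigid factorization idea mentioned above, the following is a minimal sketch, assuming an orthographic camera and noise-free synthetic tracks. The function names (`random_rotation`, `synthetic_rigid_tracks`, `factorize_rigid`) are hypothetical and chosen for illustration only. The sketch shows that the centered measurement matrix of a rigid scene has rank at most 3 and can be split by SVD into a motion factor and a shape factor. The factors are only determined up to an invertible 3x3 transformation; the metric upgrade step of the full method is omitted here.

```python
import numpy as np

def random_rotation(rng):
    # QR of a Gaussian matrix yields an orthogonal matrix;
    # flip one column if needed so that the determinant is +1.
    q, _ = np.linalg.qr(rng.standard_normal((3, 3)))
    if np.linalg.det(q) < 0:
        q[:, 0] = -q[:, 0]
    return q

def synthetic_rigid_tracks(num_frames=8, num_points=20, seed=0):
    # Assumption: one fixed 3D shape seen by an orthographic camera
    # that rotates and translates from frame to frame.
    rng = np.random.default_rng(seed)
    shape = rng.standard_normal((3, num_points))
    rows = []
    for _ in range(num_frames):
        R = random_rotation(rng)
        t = rng.standard_normal((2, 1))
        rows.append(R[:2] @ shape + t)  # keep the first two rows: orthographic projection
    return np.vstack(rows)  # measurement matrix of size (2F, N)

def factorize_rigid(W):
    # Subtracting each row's mean removes the per-frame 2D translation.
    W_centered = W - W.mean(axis=1, keepdims=True)
    U, s, Vt = np.linalg.svd(W_centered, full_matrices=False)
    # Rank-3 truncation: motion (2F x 3) and shape (3 x N), up to a 3x3 ambiguity.
    M = U[:, :3] * np.sqrt(s[:3])
    S = np.sqrt(s[:3])[:, None] * Vt[:3]
    return M, S, s

W = synthetic_rigid_tracks()
M, S, singular_values = factorize_rigid(W)
```

In this noise-free setting the fourth singular value is numerically zero, so `M @ S` reproduces the centered tracks. When the shape deforms from frame to frame, this rank-3 property no longer holds in general, which is what motivates the non-rigid formulations.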
As in the template-based case, we start by describing the settings under which most
NRSFM methods operate. We then present the most common NRSFM formulations and discuss
their ambiguities.
5.1 PROBLEM DEFINITION
In contrast to template-based reconstruction, NRSFM does not rely on a reference image where
the surface shape is known. Instead, it exploits the availability of multiple images of the object
of interest, generally in the form of a video sequence. Note that these images are not acquired
simultaneously, and, therefore, the shape of the object is different in each image. Given
frame-to-frame 2D correspondences, which can be obtained as discussed in Section 3.2, NRSFM can be
formulated as the problem of estimating the 3D locations of the individual feature points in each
input image.
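As a rough illustration of the problem size, the sketch below counts the 2D measurements against the 3D unknowns for a sequence of F frames with N tracked points. The function name `nrsfm_counts` and the parameter `camera_params_per_frame` are hypothetical and serve only this illustration; the number of camera parameters per frame depends on the projection model and is left as an input.

```python
def nrsfm_counts(num_frames, num_points, camera_params_per_frame=0):
    # Each tracked point contributes two image coordinates per frame.
    measurements = 2 * num_frames * num_points
    # Each point has its own 3D location in every frame, since the shape deforms.
    shape_unknowns = 3 * num_frames * num_points
    # Assumption: a fixed number of unknown camera parameters per frame.
    camera_unknowns = camera_params_per_frame * num_frames
    return measurements, shape_unknowns + camera_unknowns

measurements, unknowns = nrsfm_counts(num_frames=100, num_points=50)
print(measurements, unknowns)  # prints "10000 15000"
```

Even before any camera parameters are counted, there are 3FN unknowns for only 2FN measurements, which is one way to see why the correspondences alone cannot determine the per-frame shapes.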
In NRSFM, the motion of the camera is explicitly modeled and taken as an additional
unknown of the problem. As a consequence, 3D points need to be expressed in a common world
coordinate system. Furthermore, in general, camera internal parameters are not assumed to be known.
In the following analysis, we will consider the same two projection models as in the template-based
case, which we redefine here for the reader's convenience.