Formalizing Non-Rigid Structure from Motion - Deformable Surface 3D Reconstruction from Monocular Images

CHAPTER 5

Formalizing Non-Rigid Structure from Motion

The template-based methods discussed in Chapter 4 are effective at resolving the ambiguities
inherent to deformable surface 3D reconstruction from a single input image, provided that another
image in which the shape is known can be used as a reference. In practice, however, such a
reference may not always be available, so there is a need for methods that can operate without one.

One important approach to overcoming this limitation takes advantage of the fact that tracking
points over image sequences can also resolve ambiguities, without the need for a reference shape.
This has long been known in the context of rigid shape recovery and is exploited by
Structure-from-Motion (SFM) algorithms, usually through a variant of the factorization method of
Tomasi and Kanade [1992]. Although the problem was initially studied in Ullman [1983], Non-Rigid
Structure-from-Motion (NRSFM) as formulated by most recent methods was introduced in
Bregler et al. [2000] and has been vigorously pursued since then.
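To make the rigid case concrete, the following is a minimal sketch of a factorization in the spirit of Tomasi and Kanade [1992], assuming an orthographic camera and noise-free tracks. The function names, the array layout, and the synthetic data are illustrative assumptions, not taken from the original work. The sketch stops at the rank-3 decomposition and does not apply the metric upgrade, so motion and shape are recovered only up to an invertible 3x3 transformation.

```python
import numpy as np


def rigid_factorization(tracks):
    """Rank-3 factorization of 2D point tracks of a rigid object.

    tracks: array of shape (F, N, 2), N points tracked over F frames.
    Returns the motion factor M (2F x 3), the shape factor S (3 x N),
    and all singular values of the centered measurement matrix.
    """
    # Measurement matrix: x-coordinates in the first F rows, y in the last F.
    W = np.concatenate([tracks[:, :, 0], tracks[:, :, 1]], axis=0)
    # Subtracting each row's mean removes the per-frame 2D translation.
    W = W - W.mean(axis=1, keepdims=True)
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    # For a rigid object under orthography, W has rank at most 3.
    root = np.sqrt(s[:3])
    M = U[:, :3] * root
    S = root[:, None] * Vt[:3]
    return M, S, s


def synthetic_rigid_tracks(n_frames=8, n_points=20, seed=0):
    """Hypothetical test data: one fixed 3D point cloud seen by moving cameras."""
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((3, n_points))
    tracks = np.empty((n_frames, n_points, 2))
    for f in range(n_frames):
        # Two orthonormal rows act as the orthographic camera axes.
        Q, _ = np.linalg.qr(rng.standard_normal((3, 3)))
        t = rng.standard_normal(2)
        tracks[f] = (Q[:2] @ X).T + t
    return tracks


tracks = synthetic_rigid_tracks()
M, S, s = rigid_factorization(tracks)
```

On such rigid data, the singular values beyond the third vanish up to numerical precision. When the object deforms between frames, this rank-3 property no longer holds, which is the situation NRSFM addresses.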

As in the template-based case, we start by describing the settings under which most NRSFM
methods operate. We then present the most common NRSFM formulations and discuss their ambiguities.

5.1 PROBLEM DEFINITION

In contrast to template-based reconstruction, NRSFM does not rely on a reference image in which
the surface shape is known. Instead, it exploits the availability of multiple images of the object
of interest, generally in the form of a video sequence. Note that these images are not acquired
simultaneously and, therefore, the shape of the object is different in each image. Given
frame-to-frame 2D correspondences, which can be obtained as discussed in Section 3.2, NRSFM can be
formulated as the problem of estimating the 3D locations of the individual feature points in each
input image.
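As a rough illustration of this problem statement, the sketch below lays out the input and the sought output as arrays and counts their entries. The sequence size, the variable names, and the array layout are hypothetical choices made for this example only.

```python
import numpy as np

n_frames, n_points = 50, 30  # hypothetical sequence size

# Input: one 2D image location per tracked feature point per frame.
tracks_2d = np.zeros((n_frames, n_points, 2))

# Output sought: one 3D location per feature point per frame,
# because the shape is allowed to change from image to image.
shapes_3d = np.zeros((n_frames, n_points, 3))

n_measurements = tracks_2d.size     # 2 * n_frames * n_points
n_shape_unknowns = shapes_3d.size   # 3 * n_frames * n_points
```

Whatever the sequence length, the 3D coordinates alone outnumber the 2D measurements by a factor of 3/2, so the problem cannot be solved without additional constraints.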

In NRSFM, the motion of the camera is explicitly modeled and treated as an additional unknown of
the problem. As a consequence, 3D points need to be expressed in a common world coordinate system.
Furthermore, the internal camera parameters are, in general, not assumed to be known. In the
following analysis, we will consider the same two projection models as in the template-based case,
which we redefine here for the reader's convenience.
