Image Processing Reference
Unfortunately, while W was known in the weak perspective case, here it depends on the
unknown perspective depth scalars $d_i$. As a consequence, the solution cannot be estimated directly
by a simple singular value decomposition. Several solutions have been proposed to overcome this
difficulty. In Xiao and Kanade [2005], an iterative procedure was introduced that alternately computes
the structure and motion from fixed depths, and vice versa, with the depths $d_i$ initially set to 1.
In Llado et al. [2010], some parts of the surface were assumed to move rigidly. An initial
solution was therefore computed by applying rigid structure-from-motion techniques
to these parts, and then refined using a nonlinear optimization method. Recently, in Hartley and Vidal
[2008], it was shown that the solution to perspective NRSFM could be obtained in closed form
by exploiting the tensor estimation and factorization method of Hartley and Schaffalitzky [2004].
While this gives an exact solution in the noise-free case, the approach is sensitive to noise. As
observed in Hartley and Vidal [2008], this is mainly because the tensor estimation and
factorization method of Hartley and Schaffalitzky [2004] on which they relied lacks robustness to noise, as
many purely algebraic methods do.
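The alternation between depths and a low-rank factorization can be illustrated with a minimal sketch. This is not the exact algorithm of Xiao and Kanade [2005]; it is a generic alternating scheme under stated assumptions: the image points are given in homogeneous form, the target rank is supplied by the caller, and the global depth scale is fixed by normalizing the mean depth to 1. The function names (scaled_measurements, low_rank, update_depths, alternate) are hypothetical and chosen only for this illustration.

```python
import numpy as np

def scaled_measurements(x_h, depths):
    """Stack d_ij * (u_ij, v_ij, 1) into a 3F x N matrix.

    x_h:    (F, N, 3) homogeneous image points (assumed layout).
    depths: (F, N) current depth estimates.
    """
    F, N, _ = x_h.shape
    return (depths[..., None] * x_h).transpose(0, 2, 1).reshape(3 * F, N)

def low_rank(W, rank):
    """Best rank-`rank` approximation of W via truncated SVD."""
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    return (U[:, :rank] * s[:rank]) @ Vt[:rank]

def update_depths(x_h, W_hat):
    """Least-squares depth per point: project each 3-vector block of W_hat
    onto the corresponding homogeneous image point."""
    F, N, _ = x_h.shape
    blocks = W_hat.reshape(F, 3, N).transpose(0, 2, 1)  # (F, N, 3)
    return np.sum(blocks * x_h, axis=2) / np.sum(x_h * x_h, axis=2)

def alternate(x_h, rank, n_iters=20):
    """Alternate between factorization (fixed depths) and depth updates.

    Depths start at 1, as in the initialization described in the text.
    Returns the final depths and the last low-rank matrix that was
    computed (i.e., from the depths before the final update).
    """
    depths = np.ones(x_h.shape[:2])
    for _ in range(n_iters):
        W_hat = low_rank(scaled_measurements(x_h, depths), rank)
        depths = update_depths(x_h, W_hat)
        depths /= depths.mean()  # assumption: fix the global scale this way
    return depths, W_hat
```

With the true depths, the scaled measurement matrix is exactly low rank and the depth update returns those depths unchanged, which is the fixed point the alternation aims for. No convergence guarantee is claimed for this sketch; practical methods typically add further normalization or regularization.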
5.4 AMBIGUITIES OF NRSFM
Even though, in many NRSFM methods, the shape is already regularized by a linear subspace model,
ambiguities remain. This makes sense, since the shape basis is also an unknown of the problem.
Furthermore, while for template-based reconstruction going from weak to full perspective theoretically
yields a better-posed problem, perspective NRSFM still suffers from the same ambiguities as the
weak perspective formulation.
First, the decomposition of W into C and B can only be computed up to an invertible
transformation. Indeed, for any invertible $3N_s \times 3N_s$ matrix G, we can write
$$ W = CB = (CG)(G^{-1}B). $$
This was also observed for the rigid structure-from-motion problem in the factorization method
of Tomasi and Kanade [1992]. This matrix G is known as the corrective transformation. Since, in
theory, any G would do, a way must be found to choose the best one. Typically, this is done by finding
a G that ensures that the rotation matrices are orthonormal. Details on the different ways of
exploiting this constraint will be given in Chapter 6. In Xiao and Kanade [2004] and Xiao et al. [2004b], it was argued
that, even when enforcing orthonormality constraints, ambiguities remained in the reconstruction.
However, it was later shown in Akhter et al. [2009] that all solutions in this ambiguous space yield
equal structures up to a 3D rotation.
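The role of the corrective transformation is easiest to see in the rigid case mentioned above, where G is only 3 x 3. The following sketch assumes a rigid, noise-free, orthographic setting with centered points, and uses a standard linear least-squares solve for the symmetric matrix Q = G G^T from the orthonormality constraints. It illustrates the principle only and is not one of the NRSFM methods described in Chapter 6. All function names (make_rigid_data, factorize, corrective_transform) are hypothetical.

```python
import numpy as np

def make_rigid_data(n_frames=6, n_points=20, seed=0):
    """Synthetic rigid orthographic data: W = C @ S (assumed setup)."""
    rng = np.random.default_rng(seed)
    S = rng.normal(size=(3, n_points))
    S -= S.mean(axis=1, keepdims=True)  # centered 3D points
    # First two rows of a random orthogonal matrix for each frame.
    C = np.vstack([np.linalg.qr(rng.normal(size=(3, 3)))[0][:2]
                   for _ in range(n_frames)])  # (2F, 3)
    return C, S, C @ S

def factorize(W, rank=3):
    """Rank-3 SVD factorization; the factors are defined only up to G."""
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    root = np.sqrt(s[:rank])
    return U[:, :rank] * root, root[:, None] * Vt[:rank]

def corrective_transform(C_hat):
    """Find G such that each 2 x 3 block of C_hat @ G has orthonormal rows.

    Solves linearly for the symmetric Q = G @ G.T, then takes a square root.
    """
    def coeffs(a, b):
        # Coefficients of a^T Q b in the 6 unique entries of symmetric Q.
        return np.array([a[0] * b[0],
                         a[0] * b[1] + a[1] * b[0],
                         a[0] * b[2] + a[2] * b[0],
                         a[1] * b[1],
                         a[1] * b[2] + a[2] * b[1],
                         a[2] * b[2]])
    A, rhs = [], []
    for k in range(0, C_hat.shape[0], 2):
        i, j = C_hat[k], C_hat[k + 1]
        A += [coeffs(i, i), coeffs(j, j), coeffs(i, j)]
        rhs += [1.0, 1.0, 0.0]  # unit norm, unit norm, orthogonality
    q = np.linalg.lstsq(np.array(A), np.array(rhs), rcond=None)[0]
    Q = np.array([[q[0], q[1], q[2]],
                  [q[1], q[3], q[4]],
                  [q[2], q[4], q[5]]])
    w, V = np.linalg.eigh(Q)
    # Clip tiny or negative eigenvalues caused by round-off before the root.
    return V * np.sqrt(np.clip(w, 1e-12, None))  # satisfies G @ G.T ~= Q

if __name__ == "__main__":
    C, S, W = make_rigid_data()
    C_hat, S_hat = factorize(W)
    # Any invertible G leaves the product, and hence W, unchanged.
    G_any = np.random.default_rng(1).normal(size=(3, 3)) + 3 * np.eye(3)
    print(np.allclose((C_hat @ G_any) @ (np.linalg.inv(G_any) @ S_hat), W))
    # The orthonormality constraints select a G up to a rotation.
    G = corrective_transform(C_hat)
    C_fix, S_fix = C_hat @ G, np.linalg.inv(G) @ S_hat
    print(np.allclose(C_fix[:2] @ C_fix[:2].T, np.eye(2), atol=1e-6))
```

Because the recovered structure agrees with the ground truth only up to an orthogonal transformation, a fair comparison uses rotation-invariant quantities such as the Gram matrix of the points rather than the raw coordinates.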
In addition to the corrective transformation, other ambiguities inherent to NRSFM were
discussed in Aanaes and Kahl [2002]. One of them is the relative translation and scale between
the camera center and the object. As in the template-based case of Chapter 3, it is impossible to
differentiate between a fixed camera seeing an expanding object and a camera moving closer to a
constant-size object. This, in general, is overcome either by fixing the object scale, or by imposing