Image Processing Reference
Figure 6.16: Reconstruction of a piece of paper being torn apart (Taylor et al. [2010]). Courtesy of A.
Jepson. © 2010 IEEE.
In Fayad et al. [2010], the authors extended this approach to allow for more complex deformations
of the individual patches, given 2D points tracked across a video sequence. Instead of imposing
local planarity, local deformations are regularized using the quadratic deformation model discussed
in Section 6.3.1 (Fayad et al. [2009]). Each overlapping group of tracked 2D points is reconstructed
independently using the quadratic deformation model of Eq. 6.19. As in the global case, this requires
a rest shape, which can be obtained from the first few frames of the sequence using a rigid
structure-from-motion technique. As in Varol et al. [2009], the scale ambiguity of each patch is
resolved by exploiting the overlap between the patches. The top two rows of Fig. 6.15 depict results
obtained on real images of a piece of paper undergoing large deformations. The bottom row features
a comparison of this approach with that of Varol et al. [2009]. Note that the reconstructions of Fayad et al.
[2009] are more accurate than those of Varol et al. [2009]. This seems reasonable both because the
method of Fayad et al. [2009] exploits the whole sequence instead of just two images, and because
allowing the patches to deform is a better approximation of the observed phenomenon.
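
To make the idea of a per-patch quadratic deformation concrete, here is a minimal sketch in plain Python. Eq. 6.19 is not reproduced in this passage, so the sketch assumes a commonly used form of the model: each rest-shape point (x, y, z) is lifted to the 9-vector [x, y, z, x^2, y^2, z^2, xy, yz, zx] and multiplied by a 3x9 coefficient matrix made of linear, quadratic, and cross-term blocks. The function names, the matrix layout, and the example coefficients are hypothetical, and the rigid motion and camera projection are left out.

```python
def quadratic_features(point):
    """Lift a rest-shape point to [x, y, z, x^2, y^2, z^2, xy, yz, zx]."""
    x, y, z = point
    return [x, y, z, x * x, y * y, z * z, x * y, y * z, z * x]


def deform_patch(rest_points, coeffs):
    """Apply a 3x9 coefficient matrix to every rest-shape point of one patch.

    coeffs is assumed to hold the linear, quadratic, and cross-term blocks
    side by side (columns 0-2, 3-5, and 6-8 respectively).
    """
    deformed = []
    for point in rest_points:
        feats = quadratic_features(point)
        deformed.append(
            tuple(sum(c * f for c, f in zip(row, feats)) for row in coeffs)
        )
    return deformed


# Identity linear block and zeros elsewhere: the rest shape is left unchanged.
IDENTITY = [
    [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
]

# Hypothetical bending: the z coordinate gains a term proportional to x^2.
BEND = [row[:] for row in IDENTITY]
BEND[2][3] = 0.5

# Three collinear rest-shape points of a flat patch (illustrative values).
flat_patch = [(-1.0, 0.0, 0.0), (0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
bent_patch = deform_patch(flat_patch, BEND)
print(bent_patch)
```

In a patch-based reconstruction of the kind described above, one such coefficient matrix would be estimated per patch and per frame from the tracked 2D points. The sketch only shows the forward model, not the estimation.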
The two methods described above implicitly assume some amount of smoothness in the local
deformations. By contrast, the method of Taylor et al. [2010] relies exclusively on the preservation
of local Euclidean distances between feature points found on the surface, much like the methods of
Ecker et al. [2008] and Perriollat et al. [2010] introduced in Section 18.104.22.168. Note, however, that in the
NRSFM framework, the true distances between pairs of points are unknown. To overcome this
problem, triplets of neighboring points that move rigidly are identified, and the global shape is
reconstructed as a soup of triangles whose vertices remain at a fixed distance from each other. More
specifically, under orthographic projection, the 3D length of an edge between points $q_{i_1}$ and $q_{i_2}$ is
related to the length of its projection in the image plane by
$$\| q_{i_1} - q_{i_2} \|^2 - \| p_{i_1} - p_{i_2} \|^2 = (d_{i_1} - d_{i_2})^2 ,$$
where $p_i$ is the 2D projection of point $i$, and $d_i$ is its depth. Furthermore, the sum of pairwise depth
differences within a single triangle is always equal to zero, which can be written as
(d 2 − d 1 ) + (d 3 − d 2 ) + (d 1 − d 3 ) =