static objects under varying illumination. The algorithm proposed in Zhang et al. (2003) was
tested on moving objects (faces conveying arbitrary expressions). The following synthesis
is based on both works, but the reconstruction results are taken from Zhang et al. (2004)
because the object of interest in this chapter is the human face. We note that in their experiments,
Zhang et al. (2004) used four cameras and two projectors. Each side of the face was acquired
by one binocular active stereo system (one projector associated with two cameras). In this way,
the authors tried to avoid self-occlusions, which can be a challenging problem in stereo vision
(even when textured light is projected).
Spatial stereo matching. Traditional stereo systems determine the position in space of a
point $P$ by triangulation, that is, by intersecting the rays defined by the centers $c_l$, $c_r$
of cameras $C_l$, $C_r$ and the projections of $P$ in the left and right images $I_l(x_l, y_l, t)$
and $I_r(x_r, y_r, t)$, respectively. Triangulation accuracy thus depends crucially on the
solution of the correspondence problem. This kind of approach, widely used in the literature,
operates entirely within the spatial domain (the images). In fact, knowing the camera positions
($(R, t)$, the stereo extrinsic parameters), one can first apply a rectification transformation
that projects the left image $I_l(x_l, y_l, t)$ and the right image $I_r(x_r, y_r, t)$ onto a
common image plane, where $y_l = y_r = y$. Establishing correspondence thus moves from a 2D
search problem to a 1D search problem: to find $x_r$, one minimizes the 1D matching function
$F(x_r)$ of Equation 1.35,

$$F(x_r) = \sum_{V_s} \left( I_l(V_s(x_l)) - I_r(V_s(x_r)) \right)^2, \qquad (1.35)$$

where $V_s$ is a window of pixels in a spatial neighborhood of $x_l$ (or $x_r$). The size of
$V_s$ is a parameter: it is well known that a larger window yields a smoother reconstruction,
whereas a smaller window yields a noisier one. $F(x_r)$ given in Equation 1.35 is simply the
squared-difference metric. Other metrics exist in the literature; we refer the reader to the
review presented in Scharstein and Szeliski (2002). Figure 1.15c shows the facial surface
reconstructed from passive stereo (the left top frame is given in Fig. 1.15a). Here, no light
pattern is projected onto the face. The reconstruction is noisy because of the texture
homogeneity of the skin regions, which leads to matching ambiguities. In contrast, an improved reconstruction is
[Figure 1.14 labels: left and right images $I_l$, $I_r$; axes $x_l$, $x_r$, and Time; scanline $y_l = y_r = y$; spatial windows $V_s(x_l)$, $V_s(x_r)$; spacetime windows $V_{st}(x_l, t_0)$, $V_{st}(x_r, t_0)$]
Figure 1.14 Spatial vs. spacetime stereo matching. Spatial matching uses only the spatial axes
on the scanline $y$, and thus the window $V_s$, to establish correspondence. Spacetime stereo
matching extends the spatial window along the time axis, so the window $V_{st}$ is used to compute $F(x_r)$
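
The 1D search of Equation 1.35 can be illustrated with a short sketch. It is not the implementation of Zhang et al.; the function names `ssd_cost` and `match_scanline`, the parameters `half_xy`, `half_t`, and `max_disp`, and the synthetic images are all assumptions made for illustration. The images are assumed to be rectified and indexed as `img[t][y][x]`. Setting `half_t = 0` gives a purely spatial window in the sense of $V_s$, while `half_t > 0` simply extends the same window over neighboring frames as a simplified stand-in for $V_{st}$.

```python
def ssd_cost(left, right, x_l, x_r, y, t0, half_xy=2, half_t=0):
    """Sum of squared differences between two windows.

    The windows are centred on (x_l, y, t0) in `left` and (x_r, y, t0) in
    `right`. half_t = 0 gives a spatial window; half_t > 0 also spans
    neighbouring frames. The caller must keep both windows inside the data.
    """
    cost = 0.0
    for dt in range(-half_t, half_t + 1):
        for dy in range(-half_xy, half_xy + 1):
            for dx in range(-half_xy, half_xy + 1):
                diff = (left[t0 + dt][y + dy][x_l + dx]
                        - right[t0 + dt][y + dy][x_r + dx])
                cost += diff * diff
    return cost


def match_scanline(left, right, x_l, y, t0, max_disp=16, half_xy=2, half_t=0):
    """Return the x_r on scanline y that minimises the SSD cost for pixel x_l.

    Assumption: the match lies at or to the left of x_l, within max_disp pixels.
    """
    lo = max(half_xy, x_l - max_disp)
    return min(range(lo, x_l + 1),
               key=lambda x_r: ssd_cost(left, right, x_l, x_r, y, t0,
                                        half_xy, half_t))


# Synthetic rectified pair (hypothetical data): the right view is the left
# view shifted by a constant disparity d along x.
T, H, W, d = 3, 12, 40, 5
left = [[[x * x + 3 * y + 5 * t for x in range(W)]
         for y in range(H)] for t in range(T)]
right = [[[(x + d) * (x + d) + 3 * y + 5 * t for x in range(W)]
          for y in range(H)] for t in range(T)]

x_r_spatial = match_scanline(left, right, x_l=20, y=6, t0=1)
x_r_spacetime = match_scanline(left, right, x_l=20, y=6, t0=1, half_t=1)
print(x_r_spatial, x_r_spacetime)  # both 15, i.e. x_l - d
```

On this noise-free synthetic pair both window types recover the same match. The two would only behave differently on data where the spatial window alone is ambiguous, as on weakly textured skin.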