This chapter focuses on the problem of dense correspondence between exactly
two images. We discuss methods using more than two images in Chapter 8, in the
context of multi-view stereo algorithms for estimating dense 3D depth maps.
5.1 AFFINE AND PROJECTIVE TRANSFORMATIONS
The assumption that two images are related by a simple parametric transformation
is extremely common in computer vision. For example, if we denote a pair of images
and their coordinate systems by $I_1(x, y)$ and $I_2(x', y')$, the two are related by an affine transformation if

$$x' = a_{11} x + a_{12} y + b_1, \qquad y' = a_{21} x + a_{22} y + b_2 \tag{5.1}$$
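As a concrete illustration, here is a minimal NumPy sketch that applies the affine model of Equation (5.1) to an array of points; the function name apply_affine and the example rotation, scale, and translation values are ours, chosen purely for illustration.

```python
import numpy as np

def apply_affine(points, A, b):
    """Apply the affine model of Eq. (5.1) to an (n, 2) array of (x, y) points.

    A is the 2x2 matrix [[a11, a12], [a21, a22]] and b is the vector (b1, b2).
    """
    points = np.asarray(points, dtype=float)
    return points @ A.T + b

# Example: a similarity transformation (30-degree rotation, uniform scale 1.5)
# followed by a translation.
theta = np.deg2rad(30.0)
A = 1.5 * np.array([[np.cos(theta), -np.sin(theta)],
                    [np.sin(theta),  np.cos(theta)]])
b = np.array([10.0, -5.0])
print(apply_affine([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]], A, b))
```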
Affine transformations are useful since they encompass rigid motions (translations and rotations), similarity transformations (rigid motions plus a uniform change in scale), and some shape-changing transformations (such as shears). However, except under restrictive circumstances, images of the same scene from different viewpoints are rarely exactly related by affine transformations, although the approximation may be locally acceptable.¹ A more general parametric model is a projective transformation or homography, given by:
$$x' = \frac{h_{11} x + h_{12} y + h_{13}}{h_{31} x + h_{32} y + h_{33}}, \qquad y' = \frac{h_{21} x + h_{22} y + h_{23}}{h_{31} x + h_{32} y + h_{33}} \tag{5.2}$$
Images acquired using perspective projection (see Section 6.2) are exactly related by a projective transformation in two situations: either the camera undergoes pure rotation (i.e., panning, tilting, and zooming only) or the scene is entirely planar. We can see that a projective transformation with $h_{31} = h_{32} = 0$ and $h_{33} \neq 0$ reduces to an affine transformation. Also, note that multiplying all the parameters by the same nonzero scalar results in the same transformation; thus, a projective transformation has eight degrees of freedom. Figure 5.1 illustrates an image and the results of various projective transformations.
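Both observations can be checked numerically. The short self-contained sketch below (the helper warp and the example matrices are ours) verifies that scaling H by a nonzero constant leaves the mapping unchanged, and that a homography with $h_{31} = h_{32} = 0$ acts as an affine transformation.

```python
import numpy as np

def warp(points, H):
    """Map an (n, 2) array of points through a 3x3 homography H, as in Eq. (5.2)."""
    p = np.hstack([points, np.ones((len(points), 1))]) @ H.T
    return p[:, :2] / p[:, 2:3]

pts = np.array([[12.0, 34.0], [200.0, 150.0]])
H = np.array([[1.0,  0.1,   5.0],
              [0.0,  1.2,  -3.0],
              [1e-3, 2e-3,  1.0]])

# Scaling H cancels in the division of Eq. (5.2): only eight degrees of freedom.
print(np.allclose(warp(pts, H), warp(pts, 2.5 * H)))        # True

# With h31 = h32 = 0 and h33 != 0, the denominator is constant and the mapping
# is affine; divide the first two rows by h33 to read off the a_ij and b_i.
H_aff = np.array([[ 1.5, 0.2, 10.0],
                  [-0.3, 1.1, -5.0],
                  [ 0.0, 0.0,  2.0]])
A, b = H_aff[:2, :2] / H_aff[2, 2], H_aff[:2, 2] / H_aff[2, 2]
print(np.allclose(warp(pts, H_aff), pts @ A.T + b))         # True
```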
Estimating an affine or projective transformation relating an image pair typically
begins by detecting and matching features, as discussed in the previous chapter.
The further apart the camera viewpoints, the more likely it is that scale- or affine-
covariant detectors and invariant descriptors will be necessary for generating a large
number of high-quality matches. Let's denote the locations of these matches as
$\{(x_1, y_1), \ldots, (x_n, y_n)\}$ in the first image plane and $\{(x'_1, y'_1), \ldots, (x'_n, y'_n)\}$ in the second image plane. Due to matching errors and image noise, the matches are unlikely to be exact, and due to deviations from the ideal assumptions of the parametric transformation, the model is unlikely to be exact. Therefore, we search for the parameters that best fit the correspondences, usually in a robust least-squares sense. This parameter estimation problem is usually referred to as image registration.
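As a sketch of this fitting step, the NumPy code below estimates the six affine parameters of Equation (5.1) from matched points by ordinary least squares; the function name and the synthetic data are illustrative only. In practice, a robust estimator such as RANSAC (for example, via OpenCV's cv2.estimateAffine2D or cv2.findHomography) would typically replace or wrap a plain least-squares solve like this one to cope with mismatched features.

```python
import numpy as np

def fit_affine_least_squares(src, dst):
    """Fit the six parameters of Eq. (5.1) from matched points.

    src, dst: (n, 2) arrays of corresponding (x, y) and (x', y') locations, n >= 3.
    Returns (A, b) minimizing the sum of squared residuals ||A x_i + b - x'_i||^2.
    """
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    n = src.shape[0]
    # Each match contributes two linear equations in the six unknowns
    # (a11, a12, b1, a21, a22, b2).
    M = np.zeros((2 * n, 6))
    M[0::2, 0:2] = src
    M[0::2, 2] = 1.0
    M[1::2, 3:5] = src
    M[1::2, 5] = 1.0
    rhs = dst.reshape(-1)                      # [x'_1, y'_1, x'_2, y'_2, ...]
    params, *_ = np.linalg.lstsq(M, rhs, rcond=None)
    A = np.array([[params[0], params[1]],
                  [params[3], params[4]]])
    b = np.array([params[2], params[5]])
    return A, b

# Synthetic check: recover a known transformation from noisy matches.
rng = np.random.default_rng(0)
src = rng.uniform(0, 100, size=(50, 2))
A_true, b_true = np.array([[1.1, -0.2], [0.3, 0.9]]), np.array([4.0, -7.0])
dst = src @ A_true.T + b_true + rng.normal(scale=0.5, size=src.shape)
A_est, b_est = fit_affine_least_squares(src, dst)
print(np.round(A_est, 2), np.round(b_est, 2))
```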
¹ For an affine transformation to exactly represent the motion of all pixels in images acquired using
perspective projection (see Section 6.2), the image planes must both be parallel to each other and
to the direction of camera motion. Furthermore, if the translational motion is nonzero, the scene
must be a planar surface parallel to the image planes. Nonetheless, the affine assumption is often
made when the scene is far from the camera and the rotation between viewpoints is very small.
 