This chapter focuses on the problem of dense correspondence between exactly
two images. We discuss methods using more than two images in Chapter 8, in the
context of multi-view stereo algorithms for estimating dense 3D depth maps.
5.1 AFFINE AND PROJECTIVE TRANSFORMATIONS
The assumption that two images are related by a simple parametric transformation
is extremely common in computer vision. For example, if we denote a pair of images
and their coordinate systems by $I_1(x, y)$ and $I_2(x', y')$, the two are related by an affine transformation if

$$x' = a_{11} x + a_{12} y + b_1, \qquad y' = a_{21} x + a_{22} y + b_2 \tag{5.1}$$
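As a concrete illustration, here is a minimal NumPy sketch that applies the affine model of Equation (5.1) to an array of points; the function name apply_affine and the example rotation, scale, and translation values are ours, chosen purely for illustration.

```python
import numpy as np

def apply_affine(points, A, b):
    """Apply the affine model of Eq. (5.1) to an (n, 2) array of (x, y) points.

    A is the 2x2 matrix [[a11, a12], [a21, a22]] and b is the vector (b1, b2).
    """
    points = np.asarray(points, dtype=float)
    return points @ A.T + b

# Example: a similarity transformation (30-degree rotation, uniform scale 1.5)
# followed by a translation.
theta = np.deg2rad(30.0)
A = 1.5 * np.array([[np.cos(theta), -np.sin(theta)],
                    [np.sin(theta),  np.cos(theta)]])
b = np.array([10.0, -5.0])
print(apply_affine([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]], A, b))
```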
Affine transformations are useful since they encompass rigid motions (translations and rotations), similarity transformations (rigid motions plus a uniform change in scale), and some shape-changing transformations (such as shears). However, except under restrictive circumstances, images of the same scene from different viewpoints are rarely exactly related by affine transformations, although the approximation may be locally acceptable.¹ A more general parametric model is a projective transformation or homography, given by:
$$x' = \frac{h_{11} x + h_{12} y + h_{13}}{h_{31} x + h_{32} y + h_{33}}, \qquad y' = \frac{h_{21} x + h_{22} y + h_{23}}{h_{31} x + h_{32} y + h_{33}} \tag{5.2}$$
Images acquired using perspective projection (see Section 6.2) are exactly related by a projective transformation in two situations: either the camera undergoes pure rotation (i.e., panning, tilting, and zooming only) or the scene is entirely planar. We can see that a projective transformation with $h_{31} = h_{32} = 0$ and $h_{33} \neq 0$ reduces to an affine transformation. Also, note that multiplying all the parameters by the same nonzero scalar results in the same transformation; thus, a projective transformation has eight degrees of freedom. Figure 5.1 illustrates an image and the results of various projective transformations.
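Both observations can be checked numerically. The short self-contained sketch below (the helper warp and the example matrices are ours) verifies that scaling H by a nonzero constant leaves the mapping unchanged, and that a homography with $h_{31} = h_{32} = 0$ acts as an affine transformation.

```python
import numpy as np

def warp(points, H):
    """Map an (n, 2) array of points through a 3x3 homography H, as in Eq. (5.2)."""
    p = np.hstack([points, np.ones((len(points), 1))]) @ H.T
    return p[:, :2] / p[:, 2:3]

pts = np.array([[12.0, 34.0], [200.0, 150.0]])
H = np.array([[1.0,  0.1,   5.0],
              [0.0,  1.2,  -3.0],
              [1e-3, 2e-3,  1.0]])

# Scaling H cancels in the division of Eq. (5.2): only eight degrees of freedom.
print(np.allclose(warp(pts, H), warp(pts, 2.5 * H)))        # True

# With h31 = h32 = 0 and h33 != 0, the denominator is constant and the mapping
# is affine; divide the first two rows by h33 to read off the a_ij and b_i.
H_aff = np.array([[ 1.5, 0.2, 10.0],
                  [-0.3, 1.1, -5.0],
                  [ 0.0, 0.0,  2.0]])
A, b = H_aff[:2, :2] / H_aff[2, 2], H_aff[:2, 2] / H_aff[2, 2]
print(np.allclose(warp(pts, H_aff), pts @ A.T + b))         # True
```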
Estimating an affine or projective transformation relating an image pair typically
begins by detecting and matching features, as discussed in the previous chapter.
The further apart the camera viewpoints, the more likely it is that scale- or affine-
covariant detectors and invariant descriptors will be necessary for generating a large
number of high-quality matches. Let's denote the locations of these matches as
$\{(x_1, y_1), \ldots, (x_n, y_n)\}$ in the first image plane and $\{(x'_1, y'_1), \ldots, (x'_n, y'_n)\}$ in the second image plane. Due to matching errors and image noise, the matches are unlikely to be exact, and due to deviations from the ideal assumptions of the parametric transformation, the model is unlikely to be exact. Therefore, we search for the parameters that best fit the correspondences, usually in a robust least-squares sense. This parameter estimation problem is usually referred to as image registration.
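As a sketch of this fitting step, the NumPy code below estimates the six affine parameters of Equation (5.1) from matched points by ordinary least squares; the function name and the synthetic data are illustrative only. In practice, a robust estimator such as RANSAC (for example, via OpenCV's cv2.estimateAffine2D or cv2.findHomography) would typically replace or wrap a plain least-squares solve like this one to cope with mismatched features.

```python
import numpy as np

def fit_affine_least_squares(src, dst):
    """Fit the six parameters of Eq. (5.1) from matched points.

    src, dst: (n, 2) arrays of corresponding (x, y) and (x', y') locations, n >= 3.
    Returns (A, b) minimizing the sum of squared residuals ||A x_i + b - x'_i||^2.
    """
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    n = src.shape[0]
    # Each match contributes two linear equations in the six unknowns
    # (a11, a12, b1, a21, a22, b2).
    M = np.zeros((2 * n, 6))
    M[0::2, 0:2] = src
    M[0::2, 2] = 1.0
    M[1::2, 3:5] = src
    M[1::2, 5] = 1.0
    rhs = dst.reshape(-1)                      # [x'_1, y'_1, x'_2, y'_2, ...]
    params, *_ = np.linalg.lstsq(M, rhs, rcond=None)
    A = np.array([[params[0], params[1]],
                  [params[3], params[4]]])
    b = np.array([params[2], params[5]])
    return A, b

# Synthetic check: recover a known transformation from noisy matches.
rng = np.random.default_rng(0)
src = rng.uniform(0, 100, size=(50, 2))
A_true, b_true = np.array([[1.1, -0.2], [0.3, 0.9]]), np.array([4.0, -7.0])
dst = src @ A_true.T + b_true + rng.normal(scale=0.5, size=src.shape)
A_est, b_est = fit_affine_least_squares(src, dst)
print(np.round(A_est, 2), np.round(b_est, 2))
```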
¹ For an affine transformation to exactly represent the motion of all pixels in images acquired using
perspective projection (see Section 6.2), the image planes must both be parallel to each other and
to the direction of camera motion. Furthermore, if the translational motion is nonzero, the scene
must be a planar surface parallel to the image planes. Nonetheless, the affine assumption is often
made when the scene is far from the camera and the rotation between viewpoints is very small.
 