Dense Correspondence and Its Applications - Computer Vision for Visual Effects

Graphics Reference

In-Depth Information

stereo algorithms significantly outperform the current top optical flow algorithms,

when the conditions allow either type of method to be applied. The improvement is

generally attributable to the facts that (1) stereo is an easier problem (since we only

need to search for correspondences along conjugate epipolar lines instead of any-

where in the images) and (2) the discrete nature of the disparity values enables sharp

discontinuities to be distinguished more accurately, and allows powerful discrete

global optimization methods to be applied.

We begin by briefly mentioning several early methods for estimating stereo cor-

respondence, which are generally only locally optimal. These methods have been

superceded by global optimization algorithms based on algorithms like graph cuts

and belief propagation, which consistently rank highly in quantitative benchmarks

for stereo [ 425 ]. For the rest of the section we assume that I 1 and I 2 have already been

rectified.

5.5.1

Early Methods

The earliest stereo algorithms used extremely simple block-matching approaches to

determine the disparity at each pixel. That is, we forma small block around each pixel

(

I 1 and find the disparity d that minimizes some cost function of a correspond-

ing block around

x , y

) ∈

I 2 . For example, we might minimize the sum-of-squared

differences between the blocks (as in Lucas-Kanade optical flow) or the normalized

cross-correlation (see Section 4.2.2 ). In stereo, the sum of absolute differences is

frequently used for computational efficiency, defined by

(

−

d , y

) ∈

C SAD

(

x 0 , y 0 , d

) =

I 2

(

−

d , y

) −

I 1

(

x , y

) |

(5.45)

(

x , y

) ∈ W

where

In place of the absolute difference between each pair of corresponding pixels in

Equation ( 5.45 ), Birchfield and Tomasi [ 46 ] proposed a measure that is insensitive to

effects introduced by image sampling and that has consequently found widespread

use in the stereo community. 17 The Birchfield-Tomasi measure is given by

is a window centered at the pixel of interest

(

x 0 , y 0

)

I max

, I min

C BT

(

x 0 , y 0 , d

) =

min

(

max

(

0, I 1

(

x , y

) −

(

−

d , y

)

(

−

d , y

) −

I 1

(

x , y

))

(

x , y

) ∈ W

I max

, I min

max

(

0, I 2

(

−

d , y

) −

(

x , y

)

(

x , y

) −

I 2

(

−

d , y

)))

(5.46)

where I min

and I max

(

x , y

)

(

x , y

)

are the minimum and maximum of the set

2 (

, 1

I 1 (

−

1, y

) +

I 1 (

x , y

))

, I 1 (

x , y

)

2 (

I 1 (

x , y

) +

I 1 (

1, y

))

(5.47)

I min

and I max

are similarly defined. Effectively, we're linearly interpolating

between pixels in the row of I 1 to determine the best match to a pixel in I 2 and vice

versa.

(

x , y

)

(

x , y

)

17 Birchfield and Tomasi actually proposed to use themeasure only between pairs of pixels (i.e., 1

“windows”), but other researchers have aggregated the measure into larger blocks.

Computer Vision for Visual Effects

Search WWH ::

Custom Search

Home