Features and Matching - Computer Vision for Visual Effects


only retain Harris corners whose quality measure is larger than that of all the points in their N × N pixel neighborhood for some user-selected N.
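The neighborhood test just described can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: it assumes the quality measure has already been computed and stored as a list of rows `quality[y][x]`, that N is odd, and that neighborhoods are simply clipped at the image border. The function name and the small test map are hypothetical.

```python
def non_max_suppression(quality, n):
    """Return (x, y) positions whose quality strictly exceeds every other
    value in their n x n neighborhood (n assumed odd)."""
    rows, cols = len(quality), len(quality[0])
    r = n // 2
    keep = []
    for y in range(rows):
        for x in range(cols):
            q = quality[y][x]
            is_max = True
            for dy in range(-r, r + 1):
                for dx in range(-r, r + 1):
                    yy, xx = y + dy, x + dx
                    # Skip the center pixel and anything outside the image.
                    if (dy or dx) and 0 <= yy < rows and 0 <= xx < cols:
                        if quality[yy][xx] >= q:
                            is_max = False
            if is_max:
                keep.append((x, y))
    return keep


# Hypothetical quality map with two isolated peaks (9 and 7) and a
# weaker response (6) adjacent to the second peak.
quality = [
    [0, 1, 0, 0, 0],
    [1, 9, 2, 0, 0],
    [0, 2, 0, 0, 7],
    [0, 0, 0, 6, 0],
    [0, 0, 0, 0, 0],
]
corners = non_max_suppression(quality, 3)
```

With N = 3, the response of 6 is discarded because the stronger response of 7 lies inside its neighborhood, so only the two peaks survive.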

4.1.1.1 Implementation Details

For a digital image, Harris and Stephens proposed to approximate the gradients in Equation (4.3) with

$$\frac{\partial I}{\partial x}(x, y) = I(x+1, y) - I(x-1, y), \qquad \frac{\partial I}{\partial y}(x, y) = I(x, y+1) - I(x, y-1) \tag{4.5}$$
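As a small illustration of Equation (4.5), the sketch below evaluates the two differences on an image stored as a list of rows, `image[y][x]`. Leaving the one-pixel border at zero is our assumption, since the equation does not say how to treat pixels without two neighbors; the function name is hypothetical.

```python
def central_differences(image):
    """Approximate the x and y gradients with the differences of
    Equation (4.5). Border pixels are left at zero (an assumption)."""
    rows, cols = len(image), len(image[0])
    ix = [[0.0] * cols for _ in range(rows)]
    iy = [[0.0] * cols for _ in range(rows)]
    for y in range(1, rows - 1):
        for x in range(1, cols - 1):
            ix[y][x] = image[y][x + 1] - image[y][x - 1]
            iy[y][x] = image[y + 1][x] - image[y - 1][x]
    return ix, iy


# Hypothetical test image: a linear ramp I(x, y) = 2x + 3y.
image = [[2 * x + 3 * y for x in range(4)] for y in range(4)]
ix, iy = central_differences(image)
```

Note that Equation (4.5) carries no factor of 1/2, so on this ramp the interior values are 4 and 6, twice the true slopes of 2 and 3. This constant scaling affects all pixels equally.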

An alternative that ties in more closely with techniques discussed in the rest of the chapter is to approximate the gradients by convolving the image with the derivatives of a Gaussian function:

$$\frac{\partial I}{\partial x}(x, y) = I(x, y) * \frac{\partial G}{\partial x}(x, y, \sigma_D), \qquad \frac{\partial I}{\partial y}(x, y) = I(x, y) * \frac{\partial G}{\partial y}(x, y, \sigma_D) \tag{4.6}$$

where $*$ indicates convolution, and

$$G(x, y, \sigma) = \frac{1}{2\pi\sigma^2} \exp\left(-\frac{1}{2\sigma^2}\left(x^2 + y^2\right)\right) \tag{4.7}$$
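To see how the Gaussian of Equation (4.7) produces a gradient estimate, the following sketch samples its x-derivative on a pixel grid and convolves it with an image at a single pixel. It is a minimal illustration under stated assumptions: the kernel is truncated at a hypothetical radius of four pixels, the image is a list of rows `image[y][x]`, and all function names are ours. The derivative itself follows analytically from Equation (4.7): ∂G/∂x = −(x/σ²) G(x, y, σ).

```python
import math


def gaussian(x, y, sigma):
    """G(x, y, sigma) as in Equation (4.7)."""
    return math.exp(-(x * x + y * y) / (2.0 * sigma ** 2)) / (2.0 * math.pi * sigma ** 2)


def gaussian_dx_kernel(sigma, radius):
    """Sample dG/dx = -(x / sigma^2) * G on a (2*radius+1)^2 grid.
    kernel[j + radius][i + radius] holds the value at offset (i, j)."""
    return [[-(i / sigma ** 2) * gaussian(i, j, sigma)
             for i in range(-radius, radius + 1)]
            for j in range(-radius, radius + 1)]


def convolve_at(image, kernel, x, y):
    """Evaluate the convolution (I * K)(x, y) at one interior pixel."""
    radius = len(kernel) // 2
    total = 0.0
    for j in range(-radius, radius + 1):
        for i in range(-radius, radius + 1):
            total += image[y - j][x - i] * kernel[j + radius][i + radius]
    return total


# Hypothetical test image: a linear ramp I(x, y) = 2x + 3y.
image = [[2.0 * x + 3.0 * y for x in range(11)] for y in range(11)]
kernel = gaussian_dx_kernel(sigma=1.0, radius=4)
gx = convolve_at(image, kernel, 5, 5)
```

On the ramp, `gx` comes out very close to the true x-slope of 2, with a small deficit caused by truncating and sampling the kernel. The y-gradient is obtained the same way from the transposed kernel.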

That is, we smooth the image to remove high frequencies before taking the derivative. Also, to make the response as a function of window location smoother, we can replace the binary function w(x, y) in Equation (4.3) with a radially symmetric function that weights pixels in the center of the window more strongly, such as a Gaussian:

$$w(x, y) = \frac{1}{2\pi\sigma_I^2} \exp\left(-\frac{1}{2\sigma_I^2}\left((x - x_0)^2 + (y - y_0)^2\right)\right) \tag{4.8}$$

$$= G(x - x_0, y - y_0, \sigma_I) \tag{4.9}$$

where $(x_0, y_0)$ is the pixel at the center of the block.

There are two Gaussian functions at work here. The first Gaussian, in Equation (4.6), uses a derivation scale $\sigma_D$ that specifies the domain over which the image is differentiated to compute the x and y gradients and the amount of smoothing that is applied. The second Gaussian, in Equation (4.8), uses an integration scale $\sigma_I$ that specifies the domain over which the image is integrated to determine which pixels form the “window.” These two parameters play a major role in scale-space feature detection, discussed in the next section. For convenience, $\sigma_I$ is usually taken to be a fixed multiple of $\sigma_D$, with the scaling factor in the range [1.0, 2.0] [326].
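The role of the integration scale can be illustrated with a short sketch that builds the Gaussian window of Equation (4.8) and uses it to accumulate a weighted sum of gradient products. Several things here are assumptions rather than statements from the text: we take the matrix of Equation (4.3) to have the usual Harris form, a sum over the window of w(x, y) times the 2 × 2 matrix of gradient products; the scaling factor 1.5 is just one value inside the range [1.0, 2.0]; truncating the window at three times the integration scale is a hypothetical choice; and the function names are ours.

```python
import math


def gaussian_window(x0, y0, sigma_i, radius):
    """Weights w(x, y) = G(x - x0, y - y0, sigma_I) from Equation (4.8),
    for pixels within `radius` of the block center (x0, y0)."""
    weights = {}
    for y in range(y0 - radius, y0 + radius + 1):
        for x in range(x0 - radius, x0 + radius + 1):
            d2 = (x - x0) ** 2 + (y - y0) ** 2
            weights[(x, y)] = (math.exp(-d2 / (2.0 * sigma_i ** 2))
                               / (2.0 * math.pi * sigma_i ** 2))
    return weights


def weighted_gradient_matrix(ix, iy, weights):
    """Accumulate sum of w * [[Ix^2, Ix*Iy], [Ix*Iy, Iy^2]] over the window
    (assumed form of the matrix in Equation (4.3))."""
    a = b = c = 0.0
    for (x, y), w in weights.items():
        a += w * ix[y][x] ** 2
        b += w * ix[y][x] * iy[y][x]
        c += w * iy[y][x] ** 2
    return [[a, b], [b, c]]


sigma_d = 1.0                      # derivation scale (used to compute ix, iy)
sigma_i = 1.5 * sigma_d            # integration scale, assumed factor of 1.5
radius = math.ceil(3 * sigma_i)    # hypothetical truncation of the window

# Hypothetical gradients: a unit gradient in x everywhere, none in y.
ix = [[1.0] * 11 for _ in range(11)]
iy = [[0.0] * 11 for _ in range(11)]
weights = gaussian_window(5, 5, sigma_i, radius)
matrix = weighted_gradient_matrix(ix, iy, weights)
```

Because the weights sum to approximately one, the upper-left entry of `matrix` is close to 1 for this constant gradient, while the other entries are zero. The center pixel receives the largest weight, and the weight falls off symmetrically with distance from it.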

4.1.1.2 Good Features to Track

Shi, Tomasi, and Kanade [492, 442] observed that the same matrix in Equation (4.3) naturally results from investigating the properties of blocks of pixels that are good for tracking. That is, we consider a sequence of images obtained by a video camera and indexed by time, $I(x, y, t)$. We hypothesize that a given block of pixels at time $t + 1$ is actually some block of pixels at time $t$ translated by a vector $(u, v)$. That is,

$$I(x + u, y + v, t + 1) = I(x, y, t) \quad \forall (x, y) \in W \tag{4.10}$$
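The hypothesis in Equation (4.10) can be checked directly on synthetic data. The sketch below is only a brute-force illustration, not the derivation used by Shi, Tomasi, and Kanade: it measures how far a candidate translation is from satisfying Equation (4.10) with a sum of squared differences over the window W, and searches a small range of integer translations. The pattern function, frame size, window, and search range are all hypothetical.

```python
def block_ssd(frame_t, frame_t1, window, u, v):
    """Sum over (x, y) in W of (I(x+u, y+v, t+1) - I(x, y, t))^2.
    The result is zero exactly when Equation (4.10) holds on W."""
    return sum((frame_t1[y + v][x + u] - frame_t[y][x]) ** 2
               for (x, y) in window)


def pattern(x, y):
    """Hypothetical non-repeating intensity pattern."""
    return x * x + x * y + 3 * y * y


size = 12
true_u, true_v = 2, 1
# The frame at time t+1 is the frame at time t translated by (true_u, true_v).
frame_t = [[pattern(x, y) for x in range(size)] for y in range(size)]
frame_t1 = [[pattern(x - true_u, y - true_v) for x in range(size)]
            for y in range(size)]

# W is a 3 x 3 block of pixels centered at (5, 5).
window = [(x, y) for y in range(4, 7) for x in range(4, 7)]

candidates = [(u, v) for v in range(-3, 4) for u in range(-3, 4)]
best = min(candidates,
           key=lambda d: block_ssd(frame_t, frame_t1, window, d[0], d[1]))
```

For this pattern the error is zero only at the true translation, so the search recovers (2, 1). Real images rarely satisfy Equation (4.10) exactly, which is why the error is minimized rather than required to vanish.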
