Features and Matching - Computer Vision for Visual Effects

If we fix a block of pixels at time t and want to find its translation at time t + 1, we therefore minimize the cost function

$$E(u, v) = \sum_{(x, y)} w(x, y) \, \big( I(x + u, y + v, t + 1) - I(x, y, t) \big)^2 \tag{4.11}$$

where w(x, y) is 1 for pixels inside W and 0 otherwise. Using the same type of Taylor series approximation as we did in Equation (4.2) and setting the derivative equal to zero yields the linear system

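$$\begin{bmatrix} \sum_{(x, y)} w(x, y) \left( \frac{\partial I}{\partial x}(x, y, t) \right)^2 & \sum_{(x, y)} w(x, y) \, \frac{\partial I}{\partial x}(x, y, t) \, \frac{\partial I}{\partial y}(x, y, t) \\ \sum_{(x, y)} w(x, y) \, \frac{\partial I}{\partial x}(x, y, t) \, \frac{\partial I}{\partial y}(x, y, t) & \sum_{(x, y)} w(x, y) \left( \frac{\partial I}{\partial y}(x, y, t) \right)^2 \end{bmatrix} \begin{bmatrix} u \\ v \end{bmatrix} = - \begin{bmatrix} \sum_{(x, y)} w(x, y) \, \frac{\partial I}{\partial x}(x, y, t) \, \frac{\partial I}{\partial t}(x, y, t) \\ \sum_{(x, y)} w(x, y) \, \frac{\partial I}{\partial y}(x, y, t) \, \frac{\partial I}{\partial t}(x, y, t) \end{bmatrix} \tag{4.12}$$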
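As a rough illustration of Equation (4.12), the following sketch accumulates the sums over the window and solves the resulting 2 × 2 system in closed form. The function name, the list-of-rows image layout, and the use of central differences for the spatial derivatives and a frame difference for the temporal derivative are assumptions made for this example, not details taken from the text.

```python
def klt_translation_step(I0, I1, window):
    """One linearized solve of Equation (4.12) for the block translation (u, v).

    I0, I1 : images at times t and t + 1 as lists of rows, indexed I[y][x].
    window : iterable of (x, y) pixels in W; each must have all four
             neighbors inside the image (assumption of this sketch).
    """
    a = b = c = 0.0   # entries of the symmetric matrix [[a, b], [b, c]]
    p = q = 0.0       # entries of the right-hand side before negation
    for x, y in window:
        Ix = (I0[y][x + 1] - I0[y][x - 1]) / 2.0   # dI/dx, central difference
        Iy = (I0[y + 1][x] - I0[y - 1][x]) / 2.0   # dI/dy, central difference
        It = I1[y][x] - I0[y][x]                   # dI/dt, frame difference
        a += Ix * Ix
        b += Ix * Iy
        c += Iy * Iy
        p += Ix * It
        q += Iy * It
    det = a * c - b * b
    if abs(det) < 1e-12:
        raise ValueError("singular matrix: the block cannot be tracked reliably")
    # Solve [[a, b], [b, c]] [u, v]^T = -[p, q]^T with the 2x2 inverse.
    u = (-c * p + b * q) / det
    v = (b * p - a * q) / det
    return u, v
```

Because the system comes from a first-order Taylor approximation, a single solve is only accurate for small displacements; practical trackers typically repeat the step after re-warping the block.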

We can see that the square matrix in Equation (4.12) is exactly the Harris matrix H of Equation (4.3). Shi, Tomasi, and Kanade argued that for the linear system to be well conditioned (that is, for the feature to be reliably trackable), both eigenvalues of H should be sufficiently large, suggesting the criterion

$$\min(\lambda_1, \lambda_2) > \tau \tag{4.13}$$

where τ is a user-defined threshold. Features discovered in this way are quite similar to Harris corners, and are sometimes called KLT corners since they form the basis for the well-known KLT (Kanade-Lucas-Tomasi) tracker [307, 492].
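The criterion in Equation (4.13) can be checked directly, since the eigenvalues of a symmetric 2 × 2 matrix have a closed form. The sketch below is one possible way to do this; the function names, the image layout, and the central-difference gradients are assumptions of the example.

```python
import math

def min_eigenvalue(a, b, c):
    """Smaller eigenvalue of the symmetric matrix [[a, b], [b, c]]."""
    return 0.5 * (a + c) - math.hypot(0.5 * (a - c), b)

def gradient_matrix(I, window):
    """Entries (a, b, c) of the matrix H summed over the window.

    I      : image as a list of rows, indexed I[y][x].
    window : iterable of (x, y) pixels whose four neighbors lie in the image.
    """
    a = b = c = 0.0
    for x, y in window:
        Ix = (I[y][x + 1] - I[y][x - 1]) / 2.0
        Iy = (I[y + 1][x] - I[y - 1][x]) / 2.0
        a += Ix * Ix
        b += Ix * Iy
        c += Iy * Iy
    return a, b, c

def is_klt_corner(I, window, tau):
    """Apply the test min(lambda_1, lambda_2) > tau of Equation (4.13)."""
    return min_eigenvalue(*gradient_matrix(I, window)) > tau
```

A block whose intensity varies in only one direction (an edge) has one eigenvalue near zero and fails the test regardless of how strong the edge is, which matches the intuition that such a block can slide along the edge without changing its appearance.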

Shi and Tomasi extended their model for the motion of a feature from a translation to an affine transformation, to account for the deformation of features that typically occurs over long sequences (see also Section 4.1.5). That is, the scene patch corresponding to a square feature block in the first image will eventually project to a non-square area as the camera and scene objects move, so Equation (4.10) is modified to

$$I(ax + by + u, \; cx + dy + v, \; t + 1) = I(x, y, t) \quad \forall (x, y) \in W \tag{4.14}$$

where the parameters a, b, c, d allow the feature square to deform into a parallelogram. The corresponding tracker is again obtained using a Taylor expansion. We will discuss more advanced methods for affine-invariant feature detection in Section 4.1.5.
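To make the affine model of Equation (4.14) concrete, the sketch below evaluates how well a given set of parameters (a, b, c, d, u, v) explains the change between two frames, by summing the squared differences between the warped second image and the original block. It does not implement the Taylor-expansion tracker itself. The function names, the use of absolute pixel coordinates, and the choice of bilinear interpolation for non-integer positions are assumptions of this example.

```python
import math

def bilinear(I, x, y):
    """Bilinearly interpolate image I (list of rows, I[y][x]) at a real-valued point.

    The point must lie at least one pixel inside the right and bottom borders.
    """
    x0, y0 = int(math.floor(x)), int(math.floor(y))
    fx, fy = x - x0, y - y0
    top = (1 - fx) * I[y0][x0] + fx * I[y0][x0 + 1]
    bottom = (1 - fx) * I[y0 + 1][x0] + fx * I[y0 + 1][x0 + 1]
    return (1 - fy) * top + fy * bottom

def affine_dissimilarity(I0, I1, window, a, b, c, d, u, v):
    """Sum of squared residuals of Equation (4.14) over the window W.

    I0, I1 : images at times t and t + 1.
    window : iterable of integer (x, y) pixels in W.
    """
    total = 0.0
    for x, y in window:
        warped = bilinear(I1, a * x + b * y + u, c * x + d * y + v)
        total += (warped - I0[y][x]) ** 2
    return total
```

Setting a = d = 1 and b = c = 0 reduces the warp to the pure translation model, so the same routine can be used to compare the two models on a given block.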

When the feature dissimilarity (e.g., the error in Equation (4.11)) gets too large, the feature is no longer reliable and should not be tracked. When many features are simultaneously matched or tracked, outlier rejection techniques can be used to dispose of bad features [494], and the underlying epipolar geometry provides a strong constraint on where the matches can occur [576]. We will discuss the latter issue further in Chapter 5. Wu et al. [554] noted that the KLT tracker could be improved by processing frames both forward and backward in time, instead of always matching the current frame to the previous one.
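The dissimilarity test described above might be sketched as follows, using the cost of Equation (4.11) with an integer displacement so that no interpolation is needed. The dictionary layout of `tracks`, the function names, and the threshold `max_cost` are hypothetical choices for this example; the outlier rejection and epipolar constraints mentioned above are not shown.

```python
def translation_cost(I0, I1, window, u, v):
    """Cost of Equation (4.11) for an integer displacement (u, v).

    I0, I1 : images at times t and t + 1 as lists of rows, indexed I[y][x].
    window : iterable of (x, y) pixels in W, so w(x, y) = 1 on these pixels.
    """
    return sum((I1[y + v][x + u] - I0[y][x]) ** 2 for x, y in window)

def prune_tracks(I0, I1, tracks, max_cost):
    """Separate reliable features from those whose dissimilarity is too large.

    tracks : dict mapping a feature id to (window, (u, v)) (hypothetical layout).
    Returns (kept, dropped): a dict of surviving tracks and a list of dropped ids.
    """
    kept, dropped = {}, []
    for feature_id, (window, (u, v)) in tracks.items():
        if translation_cost(I0, I1, window, u, v) <= max_cost:
            kept[feature_id] = (window, (u, v))
        else:
            dropped.append(feature_id)
    return kept, dropped
```

In practice the threshold would need to scale with the window size and the image noise level, since the cost is a sum over all pixels in the block.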

Jin et al. [222] extended Shi and Tomasi's affine tracker to account for local photometric changes in the image; that is, instead of assuming that the pixel intensities
