If we fix a block of pixels at time t and want to find its translation at time t + 1, we
therefore minimize the cost function

\[
F(u, v) = \sum_{(x, y)} w(x, y) \bigl( I(x + u, y + v, t + 1) - I(x, y, t) \bigr)^2 \tag{4.11}
\]

where w(x, y) is 1 for pixels inside W and 0 otherwise. Using the same type of Taylor
series approximation as we did in Equation (4.2) and setting the derivative equal to
zero yields the linear system
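\[
\begin{bmatrix}
\sum_{(x, y)} w(x, y) \left( \frac{\partial I}{\partial x}(x, y, t) \right)^2 &
\sum_{(x, y)} w(x, y) \frac{\partial I}{\partial x}(x, y, t) \frac{\partial I}{\partial y}(x, y, t) \\
\sum_{(x, y)} w(x, y) \frac{\partial I}{\partial x}(x, y, t) \frac{\partial I}{\partial y}(x, y, t) &
\sum_{(x, y)} w(x, y) \left( \frac{\partial I}{\partial y}(x, y, t) \right)^2
\end{bmatrix}
\begin{bmatrix} u \\ v \end{bmatrix}
= -
\begin{bmatrix}
\sum_{(x, y)} w(x, y) \frac{\partial I}{\partial x}(x, y, t) \frac{\partial I}{\partial t}(x, y, t) \\
\sum_{(x, y)} w(x, y) \frac{\partial I}{\partial y}(x, y, t) \frac{\partial I}{\partial t}(x, y, t)
\end{bmatrix}
\tag{4.12}
\]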
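As a rough illustration of how the linear system in Equation (4.12) might be assembled and solved for a single window, consider the following minimal sketch. It assumes images stored as lists of rows, central differences for the spatial derivatives, and a simple frame difference for the temporal derivative; the function name `klt_translation_step` and its parameters are hypothetical and not part of the original text.

```python
def klt_translation_step(I0, I1, x0, y0, half):
    """Solve the 2x2 system of Equation (4.12) for one square window W.

    I0, I1 : images at times t and t + 1, indexed as I[y][x]
    (x0, y0): center of the window; half: half-width of the window
    Returns (u, v), or None if the matrix is (numerically) singular.
    """
    sxx = sxy = syy = sxt = syt = 0.0
    for y in range(y0 - half, y0 + half + 1):
        for x in range(x0 - half, x0 + half + 1):
            # Assumed derivative estimates: central differences in space,
            # a plain frame difference in time.
            ix = 0.5 * (I0[y][x + 1] - I0[y][x - 1])
            iy = 0.5 * (I0[y + 1][x] - I0[y - 1][x])
            it = I1[y][x] - I0[y][x]
            # w(x, y) = 1 inside the window, so each pixel contributes equally.
            sxx += ix * ix
            sxy += ix * iy
            syy += iy * iy
            sxt += ix * it
            syt += iy * it
    det = sxx * syy - sxy * sxy
    if abs(det) < 1e-12:
        return None  # both eigenvalues must be nonzero for a unique solution
    # Cramer's rule for [sxx sxy; sxy syy] [u; v] = -[sxt; syt]
    u = (-sxt * syy + sxy * syt) / det
    v = (sxy * sxt - sxx * syt) / det
    return u, v


# Usage: a smooth synthetic image translated by (u, v) = (0.1, -0.2).
I0 = [[float(x * x + y * y) for x in range(11)] for y in range(11)]
I1 = [[(x - 0.1) ** 2 + (y + 0.2) ** 2 for x in range(11)] for y in range(11)]
print(klt_translation_step(I0, I1, 5, 5, 2))
```

Because the system comes from a first-order Taylor approximation, the recovered translation is only approximate; practical trackers typically iterate this step, which the sketch omits.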
We can see that the square matrix in Equation (4.12) is exactly the Harris matrix
H of Equation (4.3). Shi, Tomasi, and Kanade argued that for the linear system to be
well conditioned (that is, for the feature to be reliably trackable), both eigenvalues
of H should be sufficiently large, suggesting the criterion

\[
\min(\lambda_1, \lambda_2) > \tau \tag{4.13}
\]

where τ is a user-defined threshold. Features discovered in this way are quite similar
to Harris corners, and are sometimes called KLT corners since they form the basis
for the well-known KLT (Kanade-Lucas-Tomasi) tracker [307, 492].
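The criterion in Equation (4.13) is inexpensive to evaluate because the eigenvalues of a symmetric 2 × 2 matrix have a closed form. The sketch below assumes the three distinct entries of H have already been accumulated over the window; the names `min_eigenvalue` and `is_trackable` are hypothetical.

```python
import math


def min_eigenvalue(sxx, sxy, syy):
    """Smaller eigenvalue of the symmetric matrix [[sxx, sxy], [sxy, syy]]."""
    mean = 0.5 * (sxx + syy)
    radius = math.hypot(0.5 * (sxx - syy), sxy)
    return mean - radius


def is_trackable(sxx, sxy, syy, tau):
    """Equation (4.13): accept the feature if min(lambda_1, lambda_2) > tau."""
    return min_eigenvalue(sxx, sxy, syy) > tau


# A window with gradients in one direction only (an edge) has a zero
# eigenvalue and is rejected; a window with two strong directions passes.
print(is_trackable(1.0, 1.0, 1.0, 0.1))
print(is_trackable(2.0, 0.0, 3.0, 1.0))
```

A suitable value of τ depends on the image intensity range and window size, so it is left as a parameter here.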
Shi and Tomasi extended their model for the motion of a feature from a translation
to an affine transformation, to account for the deformation of features that
typically occurs over long sequences (see also Section 4.1.5). That is, the scene patch
corresponding to a square feature block in the first image will eventually project
to a non-square area as the camera and scene objects move, so Equation (4.10) is
modified to

\[
I(ax + by + u, cx + dy + v, t + 1) = I(x, y, t) \quad \forall (x, y) \in W \tag{4.14}
\]

where the parameters a, b, c, d allow the feature square to deform into a parallelogram.
The corresponding tracker is again obtained using a Taylor expansion.
We will discuss more advanced methods for affine-invariant feature detection in
Section 4.1.5.
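To make the geometry of Equation (4.14) concrete, the following small sketch applies the affine map to the corners of a square window and shows that the result is a parallelogram. It assumes coordinates measured relative to the window center, and the function name `affine_warp` and the parameter values are chosen purely for illustration.

```python
def affine_warp(x, y, a, b, c, d, u, v):
    """Position at time t + 1 of the point (x, y), per Equation (4.14)."""
    return (a * x + b * y + u, c * x + d * y + v)


# Corners of a square window, listed in order around its boundary.
corners = [(-1, -1), (1, -1), (1, 1), (-1, 1)]

# Illustrative parameters: a horizontal shear (b = 0.5) plus a shift (u = 2).
warped = [affine_warp(x, y, 1.0, 0.5, 0.0, 1.0, 2.0, 0.0) for x, y in corners]

# Opposite sides of the warped shape remain equal and parallel.
bottom = (warped[1][0] - warped[0][0], warped[1][1] - warped[0][1])
top = (warped[2][0] - warped[3][0], warped[2][1] - warped[3][1])
print(warped, bottom == top)
```

Setting a = d = 1 and b = c = 0 recovers the pure translation model used earlier.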
When the feature dissimilarity (e.g., the error in Equation (4.11)) gets too large,
the feature is no longer reliable and should not be tracked. When many features
are simultaneously matched or tracked, outlier rejection techniques can be used to
dispose of bad features [494], and the underlying epipolar geometry provides a strong
constraint on where the matches can occur [576]. We will discuss the latter issue
further in Chapter 5. Wu et al. [554] noted that the KLT tracker could be improved by
processing frames both forward and backward in time, instead of always matching
the current frame to the previous one.
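The dissimilarity test mentioned above can be sketched as follows, using the error of Equation (4.11) as the measure. For simplicity this sketch assumes integer displacements, which avoids image interpolation, and a fixed error threshold; the names `dissimilarity`, `keep_reliable`, and `max_error` are hypothetical, and this is not the outlier rejection scheme of [494].

```python
def dissimilarity(I0, I1, x0, y0, half, u, v):
    """Equation (4.11) with w = 1 inside the window and integer (u, v)."""
    total = 0.0
    for y in range(y0 - half, y0 + half + 1):
        for x in range(x0 - half, x0 + half + 1):
            diff = I1[y + v][x + u] - I0[y][x]
            total += diff * diff
    return total


def keep_reliable(features, I0, I1, half, max_error):
    """Keep the features (x0, y0, u, v) whose error is at most max_error."""
    return [
        f for f in features
        if dissimilarity(I0, I1, f[0], f[1], half, f[2], f[3]) <= max_error
    ]


# Usage: the second frame is the first shifted one pixel to the right,
# so the estimate (u, v) = (1, 0) is kept and (0, 0) is discarded.
I0 = [[(x * x + 3 * y) % 7 for x in range(10)] for y in range(10)]
I1 = [[I0[y][x - 1] if x > 0 else 0 for x in range(10)] for y in range(10)]
print(keep_reliable([(4, 4, 1, 0), (4, 4, 0, 0)], I0, I1, 1, 0.5))
```

In practice the threshold would need to scale with the window size and the noise level of the footage.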
Jin et al. [222] extended Shi and Tomasi's affine tracker to account for local photometric
changes in the image; that is, instead of assuming that the pixel intensities