only retain Harris corners whose quality measure is larger than that of all the points in their $N \times N$ pixel neighborhood for some user-selected $N$.
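As a rough illustration of this neighborhood test, the following NumPy sketch keeps only the pixels whose quality measure is strictly larger than every other value in their $N \times N$ neighborhood. The function name `local_maxima`, the `threshold` parameter, and the input array `quality` (one quality value per pixel) are assumptions made for the example, not names from the text.

```python
import numpy as np

def local_maxima(quality, n=3, threshold=0.0):
    """Return (row, col) pairs whose quality exceeds `threshold` and is strictly
    larger than every other value in the surrounding n x n neighborhood."""
    if n % 2 == 0:
        raise ValueError("n must be odd so the neighborhood has a center pixel")
    quality = np.asarray(quality)
    r = n // 2
    rows, cols = quality.shape
    # Pad with -inf so border pixels are compared only against real neighbors.
    padded = np.pad(quality.astype(float), r, constant_values=-np.inf)
    keep = quality > threshold
    for dy in range(-r, r + 1):
        for dx in range(-r, r + 1):
            if dy == 0 and dx == 0:
                continue  # do not compare a pixel with itself
            neighbor = padded[r + dy:r + dy + rows, r + dx:r + dx + cols]
            keep &= quality > neighbor
    return [(int(y), int(x)) for y, x in zip(*np.nonzero(keep))]
```

Because the comparison is strict, two equal neighboring maxima suppress each other; whether that is desirable depends on the application.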
4.1.1.1 Implementation Details
For a digital image, Harris and Stephens proposed to approximate the gradients in Equation (4.3) with

\[
\frac{\partial I}{\partial x}(x, y) = I(x+1, y) - I(x-1, y), \qquad
\frac{\partial I}{\partial y}(x, y) = I(x, y+1) - I(x, y-1)
\tag{4.5}
\]
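The differences in Equation (4.5) translate directly into array slicing. The sketch below assumes the image is a 2-D NumPy array indexed as `image[y, x]` and leaves the border pixels, where one neighbor is missing, at zero; both choices, and the function name, are assumptions for illustration.

```python
import numpy as np

def central_difference_gradients(image):
    """Approximate the x and y gradients as in Equation (4.5):
    Ix(x, y) = I(x+1, y) - I(x-1, y) and Iy(x, y) = I(x, y+1) - I(x, y-1)."""
    img = np.asarray(image, dtype=float)
    ix = np.zeros_like(img)
    iy = np.zeros_like(img)
    # Columns correspond to x, rows to y; borders stay at zero.
    ix[:, 1:-1] = img[:, 2:] - img[:, :-2]
    iy[1:-1, :] = img[2:, :] - img[:-2, :]
    return ix, iy
```

For a linear ramp $I(x, y) = 3x + 2y$ this returns 6 and 4 in the interior, that is, twice the true slopes, since Equation (4.5) does not divide by the two-pixel spacing.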
An alternative that ties in more closely with techniques discussed in the rest of the chapter is to approximate the gradients by convolving the image with the derivatives of a Gaussian function:

\[
\frac{\partial I}{\partial x}(x, y) = I(x, y) * \frac{\partial G(x, y, \sigma_D)}{\partial x}, \qquad
\frac{\partial I}{\partial y}(x, y) = I(x, y) * \frac{\partial G(x, y, \sigma_D)}{\partial y}
\tag{4.6}
\]
where $*$ indicates convolution, and

\[
G(x, y, \sigma) = \frac{1}{2\pi\sigma^2} \exp\left( -\frac{1}{2\sigma^2} \left( x^2 + y^2 \right) \right)
\tag{4.7}
\]
That is, we smooth the image to remove high frequencies before taking the derivative. Also, to make the response as a function of window location smoother, we can replace the binary function $w(x, y)$ in Equation (4.3) with a radially symmetric function that weights pixels in the center of the window more strongly, such as a Gaussian:

\begin{align}
w(x, y) &= \frac{1}{2\pi\sigma_I^2} \exp\left( -\frac{1}{2\sigma_I^2} \left( (x - x_0)^2 + (y - y_0)^2 \right) \right) \tag{4.8} \\
&= G(x - x_0, y - y_0, \sigma_I) \tag{4.9}
\end{align}

where $(x_0, y_0)$ is the pixel at the center of the block.
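To make the weighting concrete, the sketch below samples the Gaussian window of Equation (4.8) on a square block of odd side length, with the block center playing the role of $(x_0, y_0)$. The function name `gaussian_window` and the choice of a square block are assumptions for illustration.

```python
import numpy as np

def gaussian_window(size, sigma_i):
    """Sample w(x, y) = G(x - x0, y - y0, sigma_i) on a size x size block
    whose center pixel is (x0, y0)."""
    if size % 2 == 0:
        raise ValueError("size must be odd so the block has a center pixel")
    r = size // 2
    # Offsets (y - y0) and (x - x0) from the block center.
    dy, dx = np.mgrid[-r:r + 1, -r:r + 1]
    return np.exp(-(dx**2 + dy**2) / (2 * sigma_i**2)) / (2 * np.pi * sigma_i**2)
```

The largest weight sits at the center and decays with distance, so pixels near the edge of the block contribute less than they would under a binary window.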
There are two Gaussian functions at work here. The first Gaussian, in Equation (4.6), uses a derivation scale $\sigma_D$ that specifies the domain over which the image is differentiated to compute the $x$ and $y$ gradients and the amount of smoothing that is applied. The second Gaussian, in Equation (4.8), uses an integration scale $\sigma_I$ that specifies the domain over which the image is integrated to determine which pixels form the “window.” These two parameters play a major role in scale-space feature detection, discussed in the next section. For convenience, $\sigma_I$ is usually taken to be a fixed multiple of $\sigma_D$, with the scaling factor in the range $[1.0, 2.0]$ [326].
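The interplay of the two scales can be sketched in a few lines of NumPy. The code below assumes that Equation (4.3), which is not reproduced in this section, is the usual weighted sum of the gradient products $I_x^2$, $I_x I_y$, and $I_y^2$ over the window. It also relies on the Gaussian of Equation (4.7) factoring into a product of two 1-D Gaussians, truncates each kernel at three standard deviations, and zero-pads at the image border. All function names and the default `scale_factor=1.5` are hypothetical choices for illustration.

```python
import numpy as np

def gaussian_1d(sigma):
    """Sample a 1-D Gaussian on integer offsets t, truncated at 3 sigma."""
    radius = int(np.ceil(3 * sigma))
    t = np.arange(-radius, radius + 1, dtype=float)
    g = np.exp(-t**2 / (2 * sigma**2)) / (np.sqrt(2 * np.pi) * sigma)
    return t, g

def separable_filter(img, k_x, k_y):
    """Convolve every row with k_x, then every column with k_y
    (zero padding; each image side must be at least the kernel length)."""
    rows = np.apply_along_axis(lambda v: np.convolve(v, k_x, mode="same"), 1, img)
    return np.apply_along_axis(lambda v: np.convolve(v, k_y, mode="same"), 0, rows)

def gaussian_gradients(image, sigma_d):
    """Gradients as in Equation (4.6): convolve the image with the x and y
    derivatives of a Gaussian at the derivation scale sigma_d."""
    img = np.asarray(image, dtype=float)
    t, g = gaussian_1d(sigma_d)
    dg = -t / sigma_d**2 * g  # derivative of the 1-D Gaussian
    ix = separable_filter(img, dg, g)
    iy = separable_filter(img, g, dg)
    return ix, iy

def harris_matrix_entries(image, sigma_d=1.0, scale_factor=1.5):
    """Per-pixel entries (a, b, c) of the assumed matrix [[a, b], [b, c]],
    using a Gaussian window at the integration scale sigma_i."""
    sigma_i = scale_factor * sigma_d  # fixed multiple of the derivation scale
    ix, iy = gaussian_gradients(image, sigma_d)
    _, g_i = gaussian_1d(sigma_i)
    a = separable_filter(ix * ix, g_i, g_i)
    b = separable_filter(ix * iy, g_i, g_i)
    c = separable_filter(iy * iy, g_i, g_i)
    return a, b, c
```

On a vertical step edge, for example, only the entry `a` is appreciably nonzero near the edge, which matches the intuition that an edge has strong gradients in a single direction.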
4.1.1.2 Good Features to Track
Shi, Tomasi, and Kanade [492, 442] observed that the same matrix in Equation (4.3) naturally results from investigating the properties of blocks of pixels that are good for tracking. That is, we consider a sequence of images obtained by a video camera and indexed by time, $I(x, y, t)$. We hypothesize that a given block of pixels at time $t + 1$ is actually some block of pixels $W$ at time $t$ translated by a vector $(u, v)$. That is,

\[
I(x + u, y + v, t + 1) = I(x, y, t) \quad \forall (x, y) \in W
\tag{4.10}
\]
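One way to see what the hypothesis in Equation (4.10) asks for is to test it by brute force: for each candidate integer translation $(u, v)$, measure how far the shifted block at time $t + 1$ is from the original block at time $t$. The sketch below does this with a sum of squared differences. This is plain exhaustive block matching, used here only to illustrate the translation model; it is not the tracking method of the cited papers. The function name, the square block, and the `max_shift` search range are assumptions for the example.

```python
import numpy as np

def best_translation(frame_t, frame_t1, top, left, size, max_shift=3):
    """Find the integer (u, v) that minimizes the sum over the block W of
    (I(x + u, y + v, t + 1) - I(x, y, t))^2, where W is the size x size block
    of frame_t with upper-left corner at row `top` and column `left`."""
    block = np.asarray(frame_t[top:top + size, left:left + size], dtype=float)
    height, width = frame_t1.shape
    best, best_err = None, np.inf
    for v in range(-max_shift, max_shift + 1):      # shift in y (rows)
        for u in range(-max_shift, max_shift + 1):  # shift in x (columns)
            y0, x0 = top + v, left + u
            if y0 < 0 or x0 < 0 or y0 + size > height or x0 + size > width:
                continue  # shifted block would leave the image
            cand = np.asarray(frame_t1[y0:y0 + size, x0:x0 + size], dtype=float)
            err = float(np.sum((cand - block) ** 2))
            if err < best_err:
                best, best_err = (u, v), err
    return best, best_err
```

If Equation (4.10) holds exactly for some $(u, v)$ within the search range, the returned error is zero at that translation.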
 