Graphics Reference
In-Depth Information
estimate the epipolar geometry and reject feature matches that pass the above tests
but are inconsistent with the epipolar lines (e.g., [ 576 ]). We discuss epipolar geometry
in detail in Chapter 5 .
4.2.3
Histogram-Based Descriptors
Several of the most commonly-used descriptors can be characterized as histograms
of pixel intensities or gradient valuesmeasuredover a set of subregions superimposed
on the feature's support region. In particular, the SIFT descriptor proposed by Lowe
[ 306 ] has proven extremely popular. The input to the detector is a feature location
(
. We first create
an oriented square centered at the feature location so that the top edge corresponds
to the dominant orientation (Figure 4.17 a). Each side of the square is a multiple of
the feature's scale (e.g., 6
x , y
)
, its estimated scale
σ
, and its estimated dominant orientation
θ
).
We then rotate the square, resampling the image pixels and smoothing the intensi-
ties with the appropriate Gaussian for the feature's characteristic scale. The gradient
at each resampled pixel is estimated and weighted by a Gaussian centered at the
middle of the square with a standard deviation of half the square's width; the goal is
to emphasize gradients closer to the center of the square.
Next, we subdivide the square into a 4
σ
4 grid of smaller squares and create a
coarsely-quantized histogram of eight gradient orientations within each grid square
(Figure 4.17 b). To make the descriptor more robust to small misalignments, the gra-
dient at a given pixel contributes to multiple grid squares and multiple histogram
bins based on trilinear interpolation (see Problem 4.19). We collect the eight his-
togram values from each of the sixteen grid squares into a 128-dimensional vector.
The final descriptor is obtained by normalizing this vector to unit length, zeroing
out any extremely large values (e.g., greater than 0.2), and renormalizing to unit
×
(a)
(b)
Figure 4.17. Constructing the SIFT descriptor. (a) An original detected feature, with charac-
teristic scale and dominant orientation. The descriptor is computed from the pixels inside the
indicated square, where the small arrow indicates the top edge. (b) The rotated and resampled
square of pixels, with the 4 × 4 grid overlaid. The eight lines inside each square indicate the size
of the histogram bin for each corresponding orientation. The SIFT descriptor is the concatenation
of these 8 × 4 × 4 = 128 gradient magnitudes into a vector, which is then normalized.
Search WWH ::




Custom Search