Digital Signal Processing Reference
Fig. 3.4 The SIFT descriptor [16] is computed by combining the normalized orientation histograms of gradients within subregions of the keypoint into a feature vector

filters at different scales. For each keypoint, its orientation and scale were selected.
A SIFT descriptor of a keypoint was obtained by first computing the gradient
magnitudes and orientations of pixels in the neighborhood region of the keypoint,
using the scale of the keypoint to select a proper Gaussian kernel to blur the
image. In order to achieve orientation invariance, the coordinates of the descriptor
and the gradient orientations were rotated relative to the keypoint orientation. The
orientation histograms within the subregions around the keypoint were computed and
combined into the SIFT feature vector. This vector was normalized to improve the
invariance to changes of illumination. Gradient Location and Orientation Histogram
(GLOH) [17] extended SIFT by allowing the SIFT descriptor to be computed on a
log-polar location grid.
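
The descriptor steps described above can be sketched in a few lines of plain Python. This is a simplified illustration, not Lowe's full method: the function name, the 4 x 4 grid of subregions, the 8 orientation bins, and the L2 normalization are assumptions made for the example. The Gaussian blur, the rotation of the sampling coordinates, and any interpolation between bins are omitted; only the gradient orientations are rotated relative to the keypoint orientation.

```python
import math

def orientation_histogram_descriptor(patch, keypoint_angle=0.0, grid=4, bins=8):
    """Simplified SIFT-like descriptor (illustrative sketch).

    patch: square 2D list of intensities whose side is divisible by `grid`.
    keypoint_angle: keypoint orientation in radians (assumed to be given).
    """
    n = len(patch)
    cell = n // grid
    hist = [0.0] * (grid * grid * bins)
    for y in range(1, n - 1):
        for x in range(1, n - 1):
            # Central-difference gradient
            dx = patch[y][x + 1] - patch[y][x - 1]
            dy = patch[y + 1][x] - patch[y - 1][x]
            mag = math.hypot(dx, dy)
            # Gradient orientation measured relative to the keypoint orientation
            theta = (math.atan2(dy, dx) - keypoint_angle) % (2 * math.pi)
            b = int(theta / (2 * math.pi) * bins) % bins
            # Subregion that contains this pixel
            cy, cx = y // cell, x // cell
            hist[(cy * grid + cx) * bins + b] += mag
    # Normalize so that a uniform scaling of intensities leaves the vector unchanged
    norm = math.sqrt(sum(v * v for v in hist))
    return [v / norm for v in hist] if norm > 0 else hist
```

With a 16 x 16 patch the function returns a 128-dimensional vector (4 x 4 subregions times 8 bins). Multiplying every pixel of the patch by a constant changes all gradient magnitudes by the same factor, which the final normalization cancels.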
HOG. The Histogram of Oriented Gradients, proposed by Dalal and Triggs [18], was
similar to SIFT. It computed the histograms of gradient orientations in different
subregions. Unlike SIFT, which was computed on sparse detected keypoints, HOG was
sampled on a dense, uniform grid and was improved by local contrast normalization
in overlapping spatial blocks. The Integral Histogram of Oriented Gradients (IHO)
[19] is an approximation of HOG and can be efficiently computed using integral
images.
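
A minimal sketch of the two ideas in this paragraph, dense cells with overlapping block normalization and integral images for fast histogram lookup, is given below. It assumes that a per-pixel orientation bin index and gradient magnitude have already been computed, and that each pixel votes for a single bin (hard binning). The function names, the cell size of 4 pixels, the 2 x 2 cell blocks, and the small constant `eps` are illustrative choices, not values taken from [18] or [19].

```python
import math

def integral_histogram(bin_idx, mag, bins):
    """One integral image per orientation bin.

    ih[b][y][x] holds the summed magnitude of all pixels (v < y, u < x)
    whose orientation bin is b.
    """
    h, w = len(mag), len(mag[0])
    ih = [[[0.0] * (w + 1) for _ in range(h + 1)] for _ in range(bins)]
    for y in range(h):
        for x in range(w):
            for b in range(bins):
                add = mag[y][x] if bin_idx[y][x] == b else 0.0
                ih[b][y + 1][x + 1] = (add + ih[b][y][x + 1]
                                       + ih[b][y + 1][x] - ih[b][y][x])
    return ih

def rect_histogram(ih, x0, y0, x1, y1):
    """Histogram over x0 <= x < x1, y0 <= y < y1 using four lookups per bin."""
    return [t[y1][x1] - t[y0][x1] - t[y1][x0] + t[y0][x0] for t in ih]

def dense_hog(bin_idx, mag, bins=9, cell=4, block=2, eps=1e-6):
    """Cell histograms on a dense uniform grid, normalized in overlapping blocks."""
    ih = integral_histogram(bin_idx, mag, bins)
    rows, cols = len(mag) // cell, len(mag[0]) // cell
    cells = [[rect_histogram(ih, c * cell, r * cell, (c + 1) * cell, (r + 1) * cell)
              for c in range(cols)] for r in range(rows)]
    feature = []
    # Blocks advance one cell at a time, so neighboring blocks overlap
    for r in range(rows - block + 1):
        for c in range(cols - block + 1):
            v = [h for i in range(block) for j in range(block)
                 for h in cells[r + i][c + j]]
            norm = math.sqrt(sum(x * x for x in v) + eps * eps)
            feature.extend(x / norm for x in v)
    return feature
```

Once the integral histograms are built, the histogram of any rectangle costs four lookups per bin regardless of the rectangle size, which is what makes dense evaluation over many windows cheap.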
MSER. Instead of detecting keypoints, Maximally Stable Extremal Regions
(MSER), proposed by Matas et al. [20], detected regions that were darker or
brighter than their surroundings. It was affine-invariant and robust to changes of
illumination. It was extended to colour in [21].
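
The notion of a region that is darker than its surroundings and stable across thresholds can be illustrated with a small sketch. It follows the general idea behind MSER (grow a connected region as the intensity threshold rises and prefer thresholds where the region area barely changes), but it is a brute-force illustration for a single seed pixel, not the efficient algorithm of [20]. The function names, the 4-connectivity, and the stability measure used here are assumptions of this example.

```python
from collections import deque

def region_area(img, seed, t):
    """Area of the 4-connected region of pixels with intensity <= t that contains seed."""
    sy, sx = seed
    if img[sy][sx] > t:
        return 0
    h, w = len(img), len(img[0])
    seen = {(sy, sx)}
    queue = deque([(sy, sx)])
    while queue:
        y, x = queue.popleft()
        for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
            if (0 <= ny < h and 0 <= nx < w
                    and (ny, nx) not in seen and img[ny][nx] <= t):
                seen.add((ny, nx))
                queue.append((ny, nx))
    return len(seen)

def most_stable_threshold(img, seed, thresholds, delta=1):
    """Threshold at which the dark region around seed changes least in relative area."""
    areas = [region_area(img, seed, t) for t in thresholds]
    best, best_q = None, float("inf")
    for i in range(delta, len(thresholds) - delta):
        if areas[i] == 0:
            continue
        # Relative area change between the neighboring thresholds
        q = (areas[i + delta] - areas[i - delta]) / areas[i]
        if q < best_q:
            best, best_q = thresholds[i], q
    return best, best_q
```

For a dark blob on a bright background, the region area stays constant over a wide range of thresholds and then jumps when the background is absorbed; the constant range is where the region counts as stable. Bright regions can be handled the same way on the inverted image.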
SURF. Bay et al. [22] proposed the SURF (Speeded Up Robust Features) descriptor,
which could be efficiently computed using integral images. The neighborhood of a
pixel was uniformly divided into $P \times Q$ spatial bins. The SURF descriptor was
calculated by accumulating the sum of Haar wavelet responses at different spatial
bins. Let $d_x$ and $d_y$ be the Haar wavelet responses in the horizontal and
vertical directions. The descriptor has a four-dimensional vector

$$\left( \sum d_x,\ \sum |d_x|,\ \sum d_y,\ \sum |d_y| \right)$$

for each spatial bin. The resulting $4 \times P \times Q$-dimensional SURF
descriptor was L1-normalized.
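
The accumulation over spatial bins can be sketched as follows. The sketch assumes that the Haar wavelet responses have already been computed for every sample of the neighborhood and are passed in as two 2D lists; the function name and the per-bin ordering (sum of d_x, sum of |d_x|, sum of d_y, sum of |d_y|) are choices made for this example. Orientation alignment and Gaussian weighting of the responses are left out.

```python
def surf_like_descriptor(dx, dy, P=4, Q=4):
    """Accumulate Haar responses over P x Q spatial bins and L1-normalize.

    dx, dy: 2D lists of horizontal and vertical responses with equal shape;
    the height should be divisible by P and the width by Q.
    """
    h, w = len(dx), len(dx[0])
    bh, bw = h // P, w // Q
    desc = []
    for p in range(P):
        for q in range(Q):
            s_dx = s_adx = s_dy = s_ady = 0.0
            for y in range(p * bh, (p + 1) * bh):
                for x in range(q * bw, (q + 1) * bw):
                    s_dx += dx[y][x]
                    s_adx += abs(dx[y][x])
                    s_dy += dy[y][x]
                    s_ady += abs(dy[y][x])
            # Four values per spatial bin
            desc.extend([s_dx, s_adx, s_dy, s_ady])
    # L1 normalization, as stated in the text
    l1 = sum(abs(v) for v in desc)
    return [v / l1 for v in desc] if l1 > 0 else desc
```

With the default P = Q = 4 the result has 4 x 4 x 4 = 64 dimensions. Keeping both the signed sums and the sums of absolute values lets the descriptor distinguish a uniform gradient from an oscillating pattern whose signed responses cancel.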