Digital Signal Processing Reference
3.2 Local Visual Cues
3.2.1 Filter-Banks and Visual Descriptors
Filter-banks and visual descriptors are used to capture the local appearance of
objects. They are computed from the neighborhood of a pixel. On the one hand, they
need to be discriminative enough to distinguish a large number of object classes,
some of which are visually similar; on the other hand, they need to be invariant to
noise, clutter, and changes in illumination and viewpoint. If they are computed at
every pixel, computational efficiency is another issue to consider. In this section
we review some widely used filter-banks and visual descriptors.
Filter-banks. Filter-banks capture certain frequencies within a neighborhood. Winn
et al. [2] proposed a set of filter-banks after testing different combinations of
Gaussians, Laplacians of Gaussians (LoG), first- and second-order derivatives of
Gaussians, and Gabor kernels on semantic object segmentation. The proposed set
included three Gaussians, four LoGs, and four first-order derivatives of Gaussians.
The three Gaussian kernels, with standard deviations σ = 1, 2, 4, were applied to
each CIE L, a, b channel. The four LoGs (with σ = 1, 2, 4, 8) and the four
first-order derivatives of Gaussians (with σ = 2, 4) were applied to the L channel
only. The first-order derivatives of Gaussians were taken in the x and y directions.
The kernels of the proposed filter-banks are shown in Fig. 3.3. Other filter-banks,
such as rotation-invariant filters and maximum-response filters, have also been
proposed [12-14]. A comparative study can be found in [15].
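The filter-bank described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the implementation used in [2]: the function names (gaussian_1d, separable_filter, winn_filter_bank), the truncation of each kernel at 3σ, the edge-replicating border handling, and the ordering of the output channels are all assumptions made for the example. The σ values for the derivative filters follow the caption of Fig. 3.3. The input is assumed to be an H x W x 3 array already converted to CIE L, a, b.

```python
import numpy as np

def gaussian_1d(sigma, order=0, truncate=3.0):
    """Sampled 1-D Gaussian (order 0) or its 1st/2nd derivative.

    The truncation radius of 3*sigma is an assumption of this sketch.
    """
    radius = int(truncate * sigma + 0.5)
    x = np.arange(-radius, radius + 1, dtype=float)
    g = np.exp(-x**2 / (2.0 * sigma**2))
    g /= g.sum()                      # unit gain for the smoothing kernel
    if order == 0:
        return g
    if order == 1:
        return -x / sigma**2 * g      # derivative of the Gaussian
    if order == 2:
        k = (x**2 - sigma**2) / sigma**4 * g
        return k - k.mean()           # force zero response on flat regions
    raise ValueError("order must be 0, 1 or 2")

def _conv_axis(a, k, axis):
    """Convolve a 2-D array with kernel k along one axis (edge padding)."""
    r = len(k) // 2
    pad = [(0, 0)] * a.ndim
    pad[axis] = (r, r)
    padded = np.pad(a, pad, mode="edge")
    return np.apply_along_axis(
        lambda v: np.convolve(v, k, mode="valid"), axis, padded)

def separable_filter(channel, kx, ky):
    """Apply kx along x (columns, axis 1), then ky along y (rows, axis 0)."""
    return _conv_axis(_conv_axis(channel, kx, axis=1), ky, axis=0)

def winn_filter_bank(lab):
    """Return an H x W x 17 stack of filter responses for a Lab image."""
    lab = np.asarray(lab, dtype=float)
    responses = []
    # Three Gaussians (sigma = 1, 2, 4) on each of L, a, b: 9 responses.
    for sigma in (1, 2, 4):
        g = gaussian_1d(sigma)
        for c in range(3):
            responses.append(separable_filter(lab[..., c], g, g))
    lum = lab[..., 0]
    # Four LoGs (sigma = 1, 2, 4, 8) on L only: 4 responses.
    for sigma in (1, 2, 4, 8):
        g, g2 = gaussian_1d(sigma), gaussian_1d(sigma, order=2)
        responses.append(separable_filter(lum, g2, g) +
                         separable_filter(lum, g, g2))
    # First-order derivatives in x and y (sigma = 2, 4) on L only: 4 responses.
    for sigma in (2, 4):
        g, g1 = gaussian_1d(sigma), gaussian_1d(sigma, order=1)
        responses.append(separable_filter(lum, g1, g))   # d/dx
        responses.append(separable_filter(lum, g, g1))   # d/dy
    return np.stack(responses, axis=-1)
```

Calling winn_filter_bank on a Lab image yields one 17-dimensional response vector per pixel (9 Gaussian, 4 LoG, and 4 derivative responses). Separable 1-D kernels are used here only to keep the per-pixel cost low, which matters when the responses are computed at every pixel.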
SIFT. SIFT (Scale-Invariant Feature Transform; see Fig. 3.4), proposed by Lowe
[16], is the most widely used local visual descriptor. It has reasonable invariance
to changes in illumination, rotation, and scale, and to small changes in viewpoint.
SIFT keypoints are detected by finding local extrema of Difference-of-Gaussian (DoG)
Fig. 3.3 A set of filter banks proposed by Winn et al. [2]. (a) Three Gaussian kernels with
σ = 1, 2, 4. They were applied to each CIE L, a, b channel. (b) Four derivatives of Gaussians
divided into the x- and y-aligned sets, each with two different values of σ = 2, 4. They were
applied to the L channel. (c) Four Laplacians of Gaussians with σ = 1, 2, 4, 8. They were
applied to the L channel.