representation. The branch-and-bound search scheme [5] is adopted to perform fast
hierarchical search for the optimal bounding boxes, leveraging a quality bound for
rectangle sets. We demonstrate that the quality function based on the Gaussianized
vector representation can be written as the sum of contributions from each feature
vector in the bounding box. Moreover, a quality bound can be obtained for any rect-
angle set in the image, with little computational cost, in addition to calculating the
Gaussianized vector representation for the whole image.
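The bound construction can be illustrated with a small sketch. Assuming each feature location contributes a precomputed score (positive or negative), the quality of a box is the sum of the scores inside it, and a rectangle set (intervals for the top, bottom, left, and right coordinates) can be bounded by summing the positive scores over the largest member rectangle and the negative scores over the smallest. The Python code below is a minimal, hypothetical sketch of such a branch-and-bound search in the spirit of [5]; the per-location score grid and all function names are illustrative assumptions, not the chapter's implementation.

```python
import heapq
import numpy as np

def integral(a):
    """Integral image with a zero top row / left column for easy box sums."""
    s = np.zeros((a.shape[0] + 1, a.shape[1] + 1))
    s[1:, 1:] = np.cumsum(np.cumsum(a, axis=0), axis=1)
    return s

def rect_sum(ii, t, b, l, r):
    """Sum of a[t:b+1, l:r+1] via the integral image ii."""
    return ii[b + 1, r + 1] - ii[t, r + 1] - ii[b + 1, l] + ii[t, l]

def quality_bound(ii_pos, ii_neg, state):
    """Upper bound on the quality of every rectangle in the set.

    state = (t_lo, t_hi, b_lo, b_hi, l_lo, l_hi, r_lo, r_hi): coordinate
    intervals.  Positive contributions are summed over the largest member
    rectangle, negative ones over the smallest (if it is non-empty).
    """
    t_lo, t_hi, b_lo, b_hi, l_lo, l_hi, r_lo, r_hi = state
    pos = rect_sum(ii_pos, t_lo, b_hi, l_lo, r_hi)
    neg = 0.0
    if t_hi <= b_lo and l_hi <= r_lo:
        neg = rect_sum(ii_neg, t_hi, b_lo, l_hi, r_lo)
    return pos + neg

def branch_and_bound(scores):
    """Return the box (t, b, l, r) maximizing the summed per-location scores."""
    h, w = scores.shape
    ii_pos = integral(np.maximum(scores, 0.0))
    ii_neg = integral(np.minimum(scores, 0.0))
    start = (0, h - 1, 0, h - 1, 0, w - 1, 0, w - 1)
    heap = [(-quality_bound(ii_pos, ii_neg, start), start)]
    while heap:
        neg_bound, state = heapq.heappop(heap)
        widths = [state[2 * i + 1] - state[2 * i] for i in range(4)]
        if max(widths) == 0:                     # a single rectangle remains
            t, _, b, _, l, _, r, _ = state
            return (t, b, l, r), -neg_bound
        i = int(np.argmax(widths))               # split the widest interval
        lo, hi = state[2 * i], state[2 * i + 1]
        mid = (lo + hi) // 2
        for half in ((lo, mid), (mid + 1, hi)):
            child = list(state)
            child[2 * i], child[2 * i + 1] = half
            child = tuple(child)
            if child[0] <= child[3] and child[4] <= child[7]:  # non-empty set
                heapq.heappush(heap,
                               (-quality_bound(ii_pos, ii_neg, child), child))
```

Because the bound over a rectangle set never underestimates the quality of any member rectangle, the best-first search always terminates at the globally optimal box without scanning all candidate windows.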
To achieve improved robustness to variation in the object class and the background, we propose a normalization approach that suppresses the within-class covariance of the Gaussianized vector representation in both the binary Support Vector Machine (SVM) classifier and the branch-and-bound search scheme.
We carry out object localization experiments on a multi-scale car dataset. The results show that the proposed object localization approach based on the Gaussianized vector representation outperforms a similar system using the branch-and-bound search based on the histogram-of-keywords representation. The normalization approach further improves the performance of the object localization system. These results suggest that the Gaussianized vector representation is effective for localization in addition to the classification and regression problems reported previously.
The rest of this chapter is organized as follows. Section 2 describes the construction of the Gaussianized vector representation. Section 3 presents the
normalization approach for robustness to object and background variation. Section
4 details the proposed efficient localization method based on the Gaussianized vec-
tor representation. The experimental results on multi-scale car detection are reported
in Section 5, followed by conclusions and discussion in Section 6. This chapter is
extended from our paper at the 1st International Workshop on Interactive Multime-
dia for Consumer Electronics at ACM Multimedia 2009 [16].
2 Gaussianized Vector Representation
The Gaussian mixture model (GMM) is widely used in various pattern recognition
problems [8, 7]. Recently, the Gaussianized vector representation was proposed.
This representation encodes an image as a bag of feature vectors, the distribution
of which is described by a GMM. Then a GMM supervector is constructed us-
ing the means of the GMM, normalized by the covariance matrices and Gaussian
component priors. A GMM-supervector-based kernel is designed to approximate
the Kullback-Leibler divergence between the GMMs of any two images, and is utilized
for supervised discriminative learning using an SVM. Variants of this GMM-based
representation have been successfully applied in several visual recognition tasks,
such as facial age estimation [11, 15], scene categorization [12] and video event
recognition [13].
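As a rough illustration of the supervector construction, the following Python sketch MAP-adapts the means of a diagonal-covariance GMM (the shared background model) to an image's bag of feature vectors, then stacks the adapted means normalized by the component priors and covariances. The relevance factor and the use of scikit-learn's `GaussianMixture` are assumptions for illustration, not the chapter's exact recipe.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def gmm_supervector(ubm, X, relevance=16.0):
    """Build a GMM supervector for the bag of feature vectors X.

    ubm: a fitted diagonal-covariance GaussianMixture shared by all images.
    The means are MAP-adapted toward X's statistics, then normalized by the
    component priors and covariances so that the linear kernel between two
    supervectors approximates a KL-divergence-based distance between the
    corresponding adapted GMMs.
    """
    post = ubm.predict_proba(X)                    # (n, K) responsibilities
    n_k = post.sum(axis=0)                         # soft counts per component
    f_k = post.T @ X                               # (K, d) first-order stats
    alpha = (n_k / (n_k + relevance))[:, None]     # adaptation coefficients
    mu_hat = alpha * (f_k / np.maximum(n_k, 1e-10)[:, None]) \
        + (1.0 - alpha) * ubm.means_
    # normalize each adapted mean by sqrt(prior_k) and Sigma_k^{-1/2}, stack
    w = np.sqrt(ubm.weights_)[:, None]
    sigma = np.sqrt(ubm.covariances_)              # (K, d) diagonal stddevs
    return (w * mu_hat / sigma).ravel()            # (K * d,) supervector
```

A linear SVM trained on such per-image supervectors then realizes the KL-divergence-approximating kernel described above.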
As pointed out by [12], the success of this representation can be attributed to
two properties. First, it establishes correspondence between feature vectors in different images in an unsupervised fashion. Second, it observes the standard normal distribution, making it more informative than the conventional histogram-of-keywords representation.
 