Efficient Object Localization with Variation-Normalized Gaussianized Vectors - Intelligent Video Event Analysis and Understanding


Fig. 1 Sample images in the multi-scale car dataset

5.2 Metric

Localization performance is measured by recall and precision, in the same way as in [1] and [5]. A hypothesized bounding box is counted as a correct detection if its location coordinates and size lie within an ellipsoid centered at the true coordinates and size. The axes of the ellipsoid are 25% of the true object dimensions in each direction. If multiple detected bounding boxes satisfy this criterion for the same object, only one is counted as correct and the others are counted as false detections.
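The criterion above can be sketched in a few lines of Python. This is only an illustrative reading of the text, not the evaluation code used in [1] or [5]. It assumes boxes are given as hypothetical `(x, y, width, height)` tuples, that "size" is summarized by the box width, and that the 25% figure gives the semi-axes of the ellipsoid. The function names are made up for this example.

```python
from typing import List, Tuple

# Hypothetical box layout for this sketch: (x, y, width, height).
Box = Tuple[float, float, float, float]


def is_correct(hyp: Box, true: Box, frac: float = 0.25) -> bool:
    """Ellipsoid test on (x, y, size) centered at the true box.

    Assumption: the semi-axes are frac times the true object dimensions,
    and size is represented by the box width.
    """
    hx, hy, hw, _ = hyp
    tx, ty, tw, th = true
    dx = (hx - tx) / (frac * tw)
    dy = (hy - ty) / (frac * th)
    ds = (hw - tw) / (frac * tw)
    return dx * dx + dy * dy + ds * ds <= 1.0


def recall_precision(hyps: List[Box], truths: List[Box]) -> Tuple[float, float]:
    """Count each true object at most once; extra hits are false detections."""
    matched = set()
    correct = 0
    for hyp in hyps:
        for i, true in enumerate(truths):
            if i not in matched and is_correct(hyp, true):
                matched.add(i)
                correct += 1
                break
    recall = correct / len(truths) if truths else 0.0
    precision = correct / len(hyps) if hyps else 0.0
    return recall, precision
```

With one true object and three hypotheses, two of which fall inside the ellipsoid, only the first hit is counted as correct. The second hit and the miss both lower precision, which matches the duplicate-detection rule stated above.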

5.3 Gaussianized Vectors

The feature vectors for each image are extracted as follows. First, square patches with random sizes between 4×4 and 12×12 are extracted on a dense pixel grid. Second, a 128-dimensional SIFT vector is extracted from each of these square patches. Third, each SIFT vector is reduced to 64 dimensions by Principal Component Analysis. Each image is thus converted to a set of 64-dimensional feature vectors.

These feature vectors are then transformed into Gaussianized vector representations as described in Section 2, so that each image is represented as a Gaussianized vector. We carry out the experiment with 32, 64, and 128 Gaussian components in the GMMs.
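The first step of the feature extraction, sampling randomly sized square patches on a dense grid, might look like the sketch below. The text does not give the grid stride, the size distribution, or how patches near the image border are handled. The stride of 4 pixels, the uniform integer sizes, the top-left anchoring, and the clipping at the border are all assumptions made for illustration, and `sample_patches` is a hypothetical name. The SIFT and PCA steps are not shown.

```python
import random
from typing import List, Tuple


def sample_patches(
    width: int,
    height: int,
    stride: int = 4,      # assumed grid spacing; not specified in the text
    min_size: int = 4,
    max_size: int = 12,
    seed: int = 0,
) -> List[Tuple[int, int, int]]:
    """Return (x, y, size) for square patches anchored at their top-left corner."""
    rng = random.Random(seed)
    patches = []
    for y in range(0, height - min_size + 1, stride):
        for x in range(0, width - min_size + 1, stride):
            # Largest square that still fits inside the image at this grid point.
            limit = min(max_size, width - x, height - y)
            size = rng.randint(min_size, limit)  # inclusive on both ends
            patches.append((x, y, size))
    return patches
```

Clipping the size at the border keeps every patch inside the image, at the cost of biasing border patches toward smaller sizes. Padding the image would be another reasonable choice.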
