Fig. 1 Sample images in the multi-scale car dataset
5.2 Metric
The localization performance is measured by recall and precision, the same way as
in [1] and [5]. A hypothesized bounding box is counted as a correct detection if its
location coordinates and size lie within an ellipsoid centered at the true coordinates
and size. The axes of the ellipsoid are 25% of the true object dimensions in each
direction. For multiple detected bounding boxes satisfying the above criteria for
the same object, only one is counted as correct and the others are counted as false
detections.
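
The criterion above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the evaluation code used in [1] or [5]: the box layout `(cx, cy, w, h)`, the function names, the use of a four-dimensional ellipsoid over center and size, and the reading of "25%" as the semi-axis length are all hypothetical choices made here.

```python
from typing import List, Tuple

# Hypothetical box layout: (center x, center y, width, height).
Box = Tuple[float, float, float, float]


def within_ellipsoid(pred: Box, truth: Box, frac: float = 0.25) -> bool:
    """True if pred lies inside an ellipsoid centered at truth.

    Assumption: the semi-axes are frac * true width for the horizontal
    terms and frac * true height for the vertical terms.
    """
    px, py, pw, ph = pred
    tx, ty, tw, th = truth
    ax, ay = frac * tw, frac * th
    dist = (
        ((px - tx) / ax) ** 2
        + ((py - ty) / ay) ** 2
        + ((pw - tw) / ax) ** 2
        + ((ph - th) / ay) ** 2
    )
    return dist <= 1.0


def recall_precision(
    preds: List[Box], truths: List[Box], frac: float = 0.25
) -> Tuple[float, float]:
    """Count at most one correct detection per true object.

    Any further hypothesis that hits an already matched object is
    left unmatched and therefore counts as a false detection.
    """
    matched = set()
    correct = 0
    for p in preds:
        for i, t in enumerate(truths):
            if i not in matched and within_ellipsoid(p, t, frac):
                matched.add(i)
                correct += 1
                break
    recall = correct / len(truths) if truths else 0.0
    precision = correct / len(preds) if preds else 0.0
    return recall, precision
```

For example, with one true object and three hypotheses, two of which fall inside the ellipsoid, only the first is counted as correct, so recall is 1 and precision is 1/3.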
5.3 Gaussianized Vectors
The feature vectors for each image are extracted as follows. First, square patches
randomly sized between 4 × 4 and 12 × 12 are extracted on a dense pixel grid.
Second, a 128-dimensional SIFT vector is extracted from each of these square patches.
Third, each SIFT vector is reduced to 64 dimensions by Principal Component
Analysis. Each image is thereby converted to a set of 64-dimensional feature vectors.
These feature vectors are further transformed into Gaussianized vector
representations as described in Section 2. Each image is therefore represented as a
Gaussianized vector. In particular, we carry out the experiment with 32, 64, and 128
Gaussian components in the GMMs, respectively.
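
The patch sampling and dimensionality reduction steps can be sketched as follows. This is a rough illustration, not the authors' implementation: the grid stride, the random seed, and the function names are assumptions, the SIFT step is omitted (any 128-dimensional descriptor array can be passed to `pca_reduce`), and in practice the PCA basis would presumably be estimated once on training descriptors rather than per image.

```python
import numpy as np


def sample_patches(height, width, stride=4, min_size=4, max_size=12, seed=0):
    """Return (x, y, size) for one randomly sized square patch per grid point.

    The stride is a hypothetical choice; the text only says the grid is dense.
    """
    rng = np.random.default_rng(seed)
    patches = []
    for y in range(0, height - max_size + 1, stride):
        for x in range(0, width - max_size + 1, stride):
            # integers() excludes the upper bound, hence max_size + 1.
            size = int(rng.integers(min_size, max_size + 1))
            patches.append((x, y, size))
    return patches


def pca_reduce(descriptors, n_components=64):
    """Project row-vector descriptors onto their top principal components."""
    centered = descriptors - descriptors.mean(axis=0)
    # Rows of vt are the principal directions, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T
```

Given an array of shape (number of patches, 128), `pca_reduce` returns an array of shape (number of patches, 64), which corresponds to the set of 64-dimensional feature vectors described above.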
 