normalization approach, which suppresses the kernel components with high variation
within each class. This method was first proposed for the speaker recognition
problem [4], and we have successfully applied it to video categorization [13].
We assume the Gaussianized vector representation kernels in Equation 8 are
characterized by a subspace spanned by the projection matrix $V_{\text{all}}$. The desired
normalization suppresses the subspace, $V$, that has the maximum inter-image distance $d_V$
for images (or image regions) of either the objects or the backgrounds:
$$d^{ab}_V = \left\| V^T \phi(Z_a) - V^T \phi(Z_b) \right\|^2. \tag{10}$$
Since V identifies the subspace in which feature similarity and label similarity are
most out of sync, this subspace can be suppressed by calculating the kernel func-
tion as in Equation 11, where C is a diagonal matrix, indicating the extent of such
asynchrony for each dimension in the subspace.
$$k(Z_a, Z_b) = \phi(Z_a)^T \left( I - V C V^T \right) \phi(Z_b). \tag{11}$$
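As an illustration of Equation 11, the following is a minimal NumPy sketch, not the authors' implementation. It assumes that the representations are available as 1-D arrays, that `V` has orthonormal columns, and that `C` is passed as its diagonal `c_diag`; the function and parameter names are hypothetical.

```python
import numpy as np

def normalized_kernel(phi_a, phi_b, V, c_diag):
    """Evaluate k(Z_a, Z_b) = phi_a^T (I - V C V^T) phi_b.

    phi_a, phi_b : 1-D arrays, the vector representations phi(Z_a), phi(Z_b)
    V            : (dim, r) array whose orthonormal columns span the subspace
    c_diag       : length-r array, the diagonal of C, entries assumed in [0, 1]
    """
    # Project both representations into the r-dimensional subspace.
    pa = V.T @ phi_a
    pb = V.T @ phi_b
    # Plain inner product minus the C-weighted contribution inside the subspace.
    return phi_a @ phi_b - pa @ (c_diag * pb)
```

With `c_diag` set to zero the function reduces to the plain inner product, and with an entry equal to one the corresponding direction is removed from the kernel entirely.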
We can find the subspace V by solving the following,
$$V = \arg\max_{V^T V = I} \sum_{a \neq b} d^{ab}_V W_{ab}, \tag{12}$$
where $W_{ab} = 1$ when $Z_a$ and $Z_b$ both belong to the object class or the background
class, otherwise $W_{ab} = 0$.
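To make the objective in Equation 12 concrete, the sketch below builds the same-class indicator matrix from labels and evaluates the weighted sum of projected distances for a candidate subspace by brute force. It is only an illustration: the helper names, the label encoding, and the column-per-image layout of `Phi` are assumptions, not part of the original method description.

```python
import numpy as np

def same_class_weights(labels):
    """W[a, b] = 1 if images a and b share a label (object or background), else 0."""
    labels = np.asarray(labels)
    return (labels[:, None] == labels[None, :]).astype(float)

def within_class_objective(V, Phi, W):
    """Sum over a != b of W[a, b] * ||V^T phi_a - V^T phi_b||^2.

    Phi holds one representation per column, shape (dim, N).
    """
    P = V.T @ Phi  # projected representations, shape (r, N)
    n = Phi.shape[1]
    total = 0.0
    for a in range(n):
        for b in range(n):
            if a != b:
                diff = P[:, a] - P[:, b]
                total += W[a, b] * (diff @ diff)
    return total
```

A direction along which same-class images lie far apart yields a large objective value, which is what marks it as a candidate for suppression.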
Denote $Z = [\phi(Z_1), \phi(Z_2), \cdots, \phi(Z_N)]$, where $N$ is the total number of training
images. It can be shown that the optimal $V$ consists of the eigenvectors corresponding
to the largest eigenvalues $\Lambda$ of the matrix $Z (D - W) Z^T$, where $D$ is a diagonal
matrix with $D_{ii} = \sum_{j=1}^{N} W_{ij}, \forall i$.
The eigenvalues $\Lambda$ indicate the extent to which the corresponding dimensions
vary within the same class. In order to ensure the diagonal elements of $C$ remain in
the range of $[0, 1]$, we apply a monotonic mapping $C = I - \max(I, \Lambda)^{-1}$.
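The eigen-decomposition and the mapping to C described above can be sketched as follows. This is a hedged illustration under stated assumptions: the number of retained directions `r` is a hypothetical parameter (the text does not say how many eigenvectors are kept), the representations are stored one per column, and no rescaling of the eigenvalues is applied.

```python
import numpy as np

def fit_variation_normalization(Phi, labels, r):
    """Return (V, c_diag) for the r directions with the largest within-class variation.

    Phi    : (dim, N) array, column i is phi(Z_i)
    labels : length-N sequence, e.g. 1 for object and 0 for background
    r      : number of directions to keep (assumed, not given in the text)
    """
    labels = np.asarray(labels)
    # W[a, b] = 1 when images a and b belong to the same class.
    W = (labels[:, None] == labels[None, :]).astype(float)
    D = np.diag(W.sum(axis=1))
    M = Phi @ (D - W) @ Phi.T            # symmetric, positive semidefinite
    evals, evecs = np.linalg.eigh(M)     # eigenvalues in ascending order
    order = np.argsort(evals)[::-1][:r]  # indices of the r largest eigenvalues
    lam = evals[order]
    V = evecs[:, order]
    # C = I - max(I, Lambda)^{-1}, applied entrywise on the diagonal.
    c_diag = 1.0 - 1.0 / np.maximum(1.0, lam)
    return V, c_diag
```

Clamping with `np.maximum(1.0, lam)` keeps every diagonal entry of C in [0, 1]: eigenvalues at or below one give an entry of zero (no suppression), and the entry approaches one as the within-class variation grows.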
4 Localization with Gaussianized Vector Representation
In this section, we first present the efficient search scheme based on
branch-and-bound in Subsection 4.1. Then we detail the quality function and quality bound for
the Gaussianized vector representation in Subsections 4.2 and 4.3 respectively. In
Subsection 4.4, we describe how to incorporate the variation-normalization approach into
the localization framework.
4.1 Branch-and-Bound Search
Localizing an object essentially amounts to finding the subarea of the image on which
a quality function f achieves its maximum over all possible subareas. One way
to define these subareas is the bounding box, which encodes the location, width
 