applications [38, 39]. Since the descriptors are heterogeneous (SMPs, low-frequency
patches, and motion patches), several such divergences will be combined.
Let us assume that the two video segments to be compared are both composed of
a single GOP. One of them will be referred to as the query and denoted by $G_Q$, the
other one as the reference, $G_R$. The dissimilarity $D$ between $G_Q$ and $G_R$ is defined
as
$$D(G_Q, G_R) = \underbrace{\alpha_s\, D_s(G_Q, G_R)}_{\text{spatial term}} + \underbrace{\alpha_t\, D_t(G_Q, G_R)}_{\text{temporal term}} \qquad (21)$$
where
$$\begin{cases} D_s(G_Q, G_R) = \sum_{0 \le k \le K-1} D_{KL}\big(p_k(G_Q) \,\|\, p_k(G_R)\big) \\ D_t(G_Q, G_R) = D_{KL}\big(p_m(G_Q) \,\|\, p_m(G_R)\big) \end{cases}. \qquad (22)$$
The positive parameters $\alpha_s$ and $\alpha_t$ allow us to tune the relative influence of the
spatial and temporal terms. The scale $k = 0$ is the coarsest scale of the decomposition,
corresponding to the low-pass subband. $p_k(G)$, respectively $p_m(G)$, denotes the PDF
underlying the SMPs or low-frequency patches $\{W_{k,p},\, p\}$, respectively the motion
patches $\{m_p,\, p\}$, extracted from the GOP $G$. Finally, $D_{KL}$ denotes the
Kullback-Leibler divergence.
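As an illustration of how (21) and (22) fit together, the following minimal Python sketch assembles $D$ from per-scale patch sets. The function name `gop_dissimilarity`, its argument layout, and the callable `kl` (standing for any estimator of the Kullback-Leibler divergence between two sets of realizations) are assumptions introduced here for illustration; they are not part of the method as described.

```python
def gop_dissimilarity(spatial_q, spatial_r, motion_q, motion_r, kl,
                      alpha_s=1.0, alpha_t=1.0):
    """Weighted dissimilarity D(G_Q, G_R) in the sense of (21)-(22).

    spatial_q, spatial_r: lists of K patch sets, one per scale
        (index 0 is the coarsest scale, i.e. the low-pass subband).
    motion_q, motion_r: sets of motion patches of the two GOPs.
    kl: hypothetical callable kl(samples_p, samples_q) returning an
        estimate of D_KL(p || q) from the two sample sets.
    alpha_s, alpha_t: positive weights of the spatial and temporal terms.
    """
    if alpha_s <= 0 or alpha_t <= 0:
        raise ValueError("alpha_s and alpha_t must be positive")
    if len(spatial_q) != len(spatial_r):
        raise ValueError("both GOPs must provide the same number of scales")

    # Spatial term D_s: sum of the divergences over the scales k = 0..K-1.
    d_s = sum(kl(pq, pr) for pq, pr in zip(spatial_q, spatial_r))

    # Temporal term D_t: a single divergence between motion-patch PDFs.
    d_t = kl(motion_q, motion_r)

    return alpha_s * d_s + alpha_t * d_t
```

Passing the estimator as an argument keeps the combination rule independent of how the divergences themselves are estimated.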
The term of scale $k$ in the sum $D_s$ can be interpreted as a measure of how dissimilar
local spatial structures are at this scale in the respective key frames of $G_Q$ and
$G_R$. Overall, $D_s$ indicates whether some objects are present in both frames. Since the
motion patches group together motion vectors and their respective locations, the
temporal term $D_t$ not only tells how the motions throughout the GOPs compare;
it also tells (roughly) whether similar shapes move the same way in both GOPs.
Let us now see how the Kullback-Leibler divergences involved in the definition
of $D$ can be conveniently estimated from a set of realizations.
3.2.2 Estimation in the kNN Framework
Estimation of the Kullback-Leibler divergence between two PDFs $p$ and $q$ when
these PDFs are known only through two respective sets of realizations $U$ and $V$
apparently requires prior estimation of the PDFs. Because the realizations are
high-dimensional vectors (dimension 18 for high-pass and band-pass SMPs, 27 for
low-pass patches, and 16 for motion patches), PDF estimation is afflicted with the
curse of dimensionality [34]. Even assuming that an accurate parametric model of the
PDFs can be built anyway (for example, a mixture of Gaussians), an analytic expression
of the divergence in terms of the model parameters exists only for some restricted
cases, such as mixtures composed of a single Gaussian. Alternatively, an entropy
estimator written directly in terms of the realizations has been proposed in the kNN
framework [35, 36, 37]. Since the Kullback-Leibler divergence is the difference
between a cross-entropy and an entropy, the divergences $D_{KL}$ involved in (22) can
then be expressed as functions of the sets of patches $\{W_{k,p},\, p\}$, for each scale
$k$, and $\{m_p,\, p\}$.
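To make the idea concrete, here is a minimal sketch of one commonly used kNN estimator of the Kullback-Leibler divergence, written directly in terms of the realizations: $\hat{D}_{KL}(p \| q) = \frac{d}{|U|} \sum_{u \in U} \log \frac{\nu_k(u)}{\rho_k(u)} + \log \frac{|V|}{|U| - 1}$, where $d$ is the dimension, $\rho_k(u)$ is the distance from $u$ to its $k$-th nearest neighbor in $U \setminus \{u\}$, and $\nu_k(u)$ is the distance to its $k$-th nearest neighbor in $V$. The exact estimator of [35, 36, 37] may differ in its details; the function names, the brute-force neighbor search, and the Euclidean distance are assumptions made here for illustration.

```python
import math


def kth_nn_distance(point, samples, k, skip_index=None):
    """Euclidean distance from `point` to its k-th nearest neighbor in
    `samples`, optionally ignoring the sample at position `skip_index`."""
    dists = sorted(
        math.dist(point, s)
        for j, s in enumerate(samples)
        if j != skip_index
    )
    return dists[k - 1]


def knn_kl_divergence(U, V, k=1):
    """kNN estimate of D_KL(p || q) from realizations U ~ p and V ~ q.

    U and V are sequences of equal-length coordinate tuples. Brute-force
    search is used for clarity; a spatial index would be preferable for
    large sets. Duplicate points (zero distances) are not handled.
    """
    n, m = len(U), len(V)
    if n < 2 or k > n - 1 or k > m:
        raise ValueError("not enough realizations for the requested k")
    d = len(U[0])

    log_ratio_sum = 0.0
    for i, u in enumerate(U):
        rho = kth_nn_distance(u, U, k, skip_index=i)  # neighbor within U, u excluded
        nu = kth_nn_distance(u, V, k)                 # neighbor within V
        log_ratio_sum += math.log(nu / rho)

    # Cross-entropy and entropy estimates differ only through the neighbor
    # distances and the set sizes, which yields the closed form below.
    return d * log_ratio_sum / n + math.log(m / (n - 1))
```

Such a function could, for example, be called once per scale on the sets of spatial patches and once on the sets of motion patches to obtain the divergences appearing in (22).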