applications [38, 39]. Since the descriptors are heterogeneous (SMPs, low-frequency
patches, and motion patches), several such divergences will be combined.
Let us assume that the two video segments to be compared are both composed of
a single GOP. One of them will be referred to as the query and denoted by $G_Q$, the
other one as the reference, $G_R$. The dissimilarity $D$ between $G_Q$ and $G_R$ is defined
as
\[
D(G_Q, G_R) = \underbrace{\alpha_s \, D_s(G_Q, G_R)}_{\text{spatial term}}
            + \underbrace{\alpha_t \, D_t(G_Q, G_R)}_{\text{temporal term}}
\tag{21}
\]
where
\[
\begin{aligned}
D_s(G_Q, G_R) &= \sum_{0 \le k \le K-1} D_{KL}\bigl(p_k(G_Q) \,\|\, p_k(G_R)\bigr), \\
D_t(G_Q, G_R) &= D_{KL}\bigl(p_m(G_Q) \,\|\, p_m(G_R)\bigr).
\end{aligned}
\tag{22}
\]
The positive parameters $\alpha_s$ and $\alpha_t$ allow us to tune the relative influence of the
spatial and temporal terms. The scale $k = 0$ is the coarsest scale of the decomposition,
corresponding to the low-pass subband. $p_k(G)$, respectively $p_m(G)$, denotes the PDF
underlying the SMPs or low-frequency patches $\{W_{k,p},\, p\}$, respectively the motion
patches $\{m_p,\, p\}$, extracted from the GOP $G$. Finally, $D_{KL}$ denotes the
Kullback-Leibler divergence.
The term of scale $k$ in the sum $D_s$ can be interpreted as a measure of how dissimilar
local spatial structures are at this scale in the respective key frames of $G_Q$ and
$G_R$. Overall, $D_s$ indicates whether some objects are present in both frames. Since the
motion patches group together motion vectors and their respective locations, the temporal
term $D_t$ not only tells how the motions throughout the GOPs compare;
it also tells (roughly) whether similar shapes move the same way in both GOPs.
Let us now see how the Kullback-Leibler divergences involved in the definition
of $D$ can be conveniently estimated from a set of realizations.
3.2.2 Estimation in the kNN Framework
Estimating the Kullback-Leibler divergence between two PDFs $p$ and $q$ that are
known only through two respective sets of realizations $U$ and $V$ apparently
requires prior estimation of the PDFs. Because the realizations are vectors
of high dimension (18 for high-pass and band-pass SMPs, 27 for low-pass patches,
and 16 for motion patches), PDF estimation is afflicted with the curse of
dimensionality [34]. Even assuming that an accurate parametric model of the PDFs can be
built anyway (for example, a mixture of Gaussians), an analytic expression of the
divergence in terms of the model parameters exists only for some restricted cases,
such as mixtures composed of a single Gaussian. Alternatively, an entropy
estimator written directly in terms of the realizations has been proposed in the kNN
framework [35, 36, 37]. Since the Kullback-Leibler divergence is the difference
between a cross-entropy and an entropy, the divergences $D_{KL}$ involved in (22) can
then be expressed as functions of the sets of patches $\{W_{k,p},\, p\}$, for each
scale $k$, and $\{m_p,\, p\}$.
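To make the idea concrete, here is a minimal Python sketch of a commonly used kNN estimator of the Kullback-Leibler divergence. It is assumed here for illustration and is not necessarily the exact expression used by the authors. With $n = |U|$, $m = |V|$, dimension $d$, $\rho_k(u)$ the distance from $u$ to its $k$-th nearest neighbour in $U \setminus \{u\}$ and $\nu_k(u)$ the distance to its $k$-th nearest neighbour in $V$, subtracting the kNN entropy estimate from the kNN cross-entropy estimate gives
\[
\hat{D}_{KL}(p \,\|\, q) = \frac{d}{n} \sum_{u \in U} \log \frac{\nu_k(u)}{\rho_k(u)} + \log \frac{m}{n-1}.
\]
The function names, the brute-force neighbour search, and the `floor` guard against zero distances are choices made for this sketch only.

```python
import math
import random


def kth_nn_distance(x, points, k, skip_index=None):
    """Euclidean distance from x to its k-th nearest neighbour in points.

    skip_index excludes one element of points (used to leave x itself out
    when x belongs to the set being searched).
    """
    dists = sorted(
        math.dist(x, y) for j, y in enumerate(points) if j != skip_index
    )
    return dists[k - 1]


def knn_kl_divergence(U, V, k=1, floor=1e-12):
    """kNN estimate of D_KL(p || q) from samples U ~ p and V ~ q.

    U, V: sequences of equal-length coordinate tuples.
    floor: assumption of this sketch; clamps zero distances (duplicate
    samples) so that the logarithm stays finite.
    """
    n, m, d = len(U), len(V), len(U[0])
    if n <= k or m < k:
        raise ValueError("need more than k samples in U and at least k in V")
    total = 0.0
    for i, u in enumerate(U):
        rho = max(kth_nn_distance(u, U, k, skip_index=i), floor)  # within U
        nu = max(kth_nn_distance(u, V, k), floor)                 # into V
        total += math.log(nu / rho)
    return d * total / n + math.log(m / (n - 1))


def gaussian_sample(rng, n, mean):
    """Draw n points from an isotropic unit-variance Gaussian (toy data)."""
    return [tuple(rng.gauss(mu, 1.0) for mu in mean) for _ in range(n)]


if __name__ == "__main__":
    rng = random.Random(0)
    U = gaussian_sample(rng, 200, (0.0, 0.0))
    V = gaussian_sample(rng, 200, (2.0, 0.0))
    # For these two Gaussians the true divergence is 2.0; the printed
    # value is a noisy estimate of it.
    print(knn_kl_divergence(U, V, k=5))
```

The brute-force search costs on the order of $n(n + m)$ distance evaluations per call; for patch sets of realistic size a spatial index or an approximate nearest-neighbour search would normally replace it. No density model is fitted at any point, which is what lets the approach sidestep explicit PDF estimation in 16 to 27 dimensions.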
 