Table 10.2 Video retrieval results, obtained by LMM-based audio indexing, using 25 queries

Average precision (%)

LMM features                       Music video  Fighting  Ship crashing  Dance party  Dialogues  Sound effects  Average
Seven-level wavelet decomposition  80.0         77.5      67.2           60.0         92.2       89.6           77.74
Nine-level wavelet decomposition   83.8         83.8      65             78.8         100        93.8           84.2
10.3 Visual Content Characterization
This section applies the template-frequency model (TFM) to video indexing. The
TFM was discussed in Chap. 3, but its performance has not yet been compared
with other methods. The TFM procedure for video indexing is summarized below
and then demonstrated on the movie database.
10.3.1 Visual Indexing Algorithm
The TFM for video indexing is summarized in the following steps:
• Step 1: Template generation. A competitive learning algorithm [330, 331] is
applied to generate prototype vectors, $C = \{c_1, \ldots, c_j, \ldots, c_{T_c}\}$,
$c_j \in \mathbb{R}^d$, where $c_j$ is obtained by modification of input color
histograms, and $T_c$ is the total number of prototypes (templates).
• Step 2: Multiple label vector quantization. For a given video clip, a primary
descriptor is obtained: $D = \{h_1, \ldots, h_i, \ldots, h_{T_d}\}$, where $h_i$ is
the 48-bin color histogram vector of the $i$-th frame, and $T_d$ is the total
number of frames. Each vector $h_i$ is quantized by the prototype vectors in $C$
using multiple labels:

$$Q(h_i) = \{ l_1^{(h_i)}, l_2^{(h_i)}, \ldots, l_k^{(h_i)} \}, \quad l_j^{(h_i)} \in \{1, 2, \ldots, T_c\} \tag{10.13}$$

where $l_1^{(h_i)}$ is the label of the best-match template, and $l_k^{(h_i)}$ is
the label of the $k$-th best-match template.
• Step 3: TF×IDF weighting. The resulting $Q(h_i)$, $i = 1, \ldots, T_d$, give a
set of labels for all frames of the video, which are concatenated into a single
weight vector, $f_v = [f_1, \ldots, f_j, \ldots, f_{T_c}]^t$. The weight parameter
$f_j$ is obtained by:

$$f_j = \frac{fr(c_j)}{\max_j \{ fr(c_j) \}} \times \log \frac{N_v}{n(c_j)} \tag{10.14}$$
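Steps 2 and 3 can be illustrated with a short NumPy sketch. It is not the authors' implementation, and it rests on assumptions that the text above does not state: the "best match" is taken to be the template with the smallest Euclidean distance, fr(c_j) is read as the number of times label j occurs among all labels of the clip, N_v as the number of videos in the database, and n(c_j) as the number of videos in which template c_j occurs. The function names and the toy data are hypothetical, and templates that occur in no video are given weight zero purely to avoid a division by zero.

```python
import numpy as np

def multi_label_quantize(frames, templates, k):
    """Eq. (10.13): labels (1-based) of the k best-matching templates per frame.

    Assumption: "best match" means smallest Euclidean distance.
    frames: (T_d, d) array of histograms; templates: (T_c, d) array.
    """
    dists = np.linalg.norm(frames[:, None, :] - templates[None, :, :], axis=2)
    return np.argsort(dists, axis=1, kind="stable")[:, :k] + 1

def template_frequencies(labels, num_templates):
    """Assumed meaning of fr(c_j): occurrences of label j over the whole clip."""
    return np.bincount(labels.ravel() - 1, minlength=num_templates).astype(float)

def tfidf_weights(fr, num_videos, videos_with_template):
    """Eq. (10.14): f_j = fr(c_j) / max_j fr(c_j) * log(N_v / n(c_j))."""
    tf = fr / fr.max() if fr.max() > 0 else np.zeros_like(fr)
    idf = np.zeros_like(fr)
    seen = videos_with_template > 0  # guard: skip templates with n(c_j) = 0
    idf[seen] = np.log(num_videos / videos_with_template[seen])
    return tf * idf

# Toy data (hypothetical): T_c = 3 templates in d = 2, a clip of T_d = 3 frames.
templates = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
frames = np.array([[0.1, 0.0], [0.9, 0.0], [0.9, 0.1]])

labels = multi_label_quantize(frames, templates, k=2)
fr = template_frequencies(labels, num_templates=len(templates))

# Hypothetical database statistics: N_v = 4 videos, n(c_j) per template.
f_v = tfidf_weights(fr, num_videos=4, videos_with_template=np.array([4, 2, 1]))
```

In this toy case the first template appears in every video, so its logarithmic factor is zero and it contributes nothing to f_v, which is the usual effect of the IDF term on non-discriminative templates.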