Table 10.2 Video retrieval results, obtained by LMM-based audio indexing, using 25 queries

Average precision (%)

| LMM features | Music video | Fighting | Ship crashing | Dance party | Dialogues | Sound effects | Average |
|---|---|---|---|---|---|---|---|
| Seven-level wavelet decomposition | 80.0 | 77.5 | 67.2 | 60.0 | 92.2 | 89.6 | 77.74 |
| Nine-level wavelet decomposition | 83.8 | 83.8 | 65 | 78.8 | 100 | 93.8 | 84.2 |
10.3 Visual Content Characterization
This section demonstrates the application of the template-frequency model (TFM)
for video indexing. The TFM was discussed in Chap. 3, but its performance has not
been compared to other methods. In the following, a summary of TFM for video
indexing is given, and a demonstration is performed by applying it to the movie
database.
10.3.1 Visual Indexing Algorithm
The TFM for video indexing is summarized in the following steps:
Step 1: Template generation. A competitive learning algorithm [330, 331] is
applied to generate prototype vectors, $C = \{c_1, \ldots, c_j, \ldots, c_{T_c}\}$,
$c_j \in \mathbb{R}^d$, where $c_j$ is obtained by modification of input color
histograms, and $T_c$ is the total number of prototypes (templates).
Step 2: Multiple label vector quantization. For a given video clip, a primary
descriptor is obtained: $D = \{h_1, \ldots, h_i, \ldots, h_{T_d}\}$, where $h_i$ is
the 48-bin color histogram vector of the i-th frame, and $T_d$ is the total number
of frames. Each vector $h_i$ is quantized by the prototype vectors in $C$ using
multiple labels:

$$Q(h_i) = \{ l_1^{(h_i)}, l_2^{(h_i)}, \ldots, l_k^{(h_i)} \}, \quad l_j^{(h_i)} \in \{1, 2, \ldots, T_c\} \tag{10.13}$$

where $l_1^{(h_i)}$ is the label of the best-match template, and $l_k^{(h_i)}$ is
the label of the k-th best-match template.
Step 3: TF × IDF weighting. The resulting $Q(h_i)$, $i = 1, \ldots, T_d$ give a set
of labels corresponding to the entire video frames, which are concatenated into
a single weight vector, $f_v = [f_1, \ldots, f_j, \ldots, f_{T_c}]^t$. The weight
parameter $f_j$ is obtained by:

$$f_j = \frac{fr(c_j)}{\max_j \{ fr(c_j) \}} \times \log \frac{N_v}{n(c_j)} \tag{10.14}$$
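The three steps above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the implementation used for the reported experiments: Step 1 is shown as plain winner-take-all competitive learning (the specific algorithm of [330, 331] is not reproduced here), best matches are ranked by squared Euclidean distance, and the symbols in Eq. (10.14) are read in the usual TF × IDF sense, with fr(c_j) taken as the number of times template j is used as a label in the video, N_v as the number of videos in the database, and n(c_j) as the number of videos in which template j occurs. All function names and parameters (`train_templates`, `epochs`, `lr`, and so on) are hypothetical.

```python
import math
from collections import Counter


def squared_distance(a, b):
    """Squared Euclidean distance (an assumed similarity measure)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def train_templates(histograms, num_templates, epochs=20, lr=0.1):
    """Step 1 (sketch): winner-take-all competitive learning.

    Assumption: templates start as copies of the first inputs and the
    learning rate decays linearly; the cited algorithm may differ.
    """
    templates = [list(h) for h in histograms[:num_templates]]
    for epoch in range(epochs):
        rate = lr * (1 - epoch / epochs)
        for h in histograms:
            win = min(range(num_templates),
                      key=lambda j: squared_distance(h, templates[j]))
            # Move only the winning template toward the input histogram.
            templates[win] = [c + rate * (x - c)
                              for c, x in zip(templates[win], h)]
    return templates


def multi_label_quantize(h, templates, k):
    """Step 2, Eq. (10.13): 1-based labels of the k best-match templates."""
    order = sorted(range(len(templates)),
                   key=lambda j: squared_distance(h, templates[j]))
    return [j + 1 for j in order[:k]]


def template_frequencies(frames, templates, k):
    """Count label occurrences over all frames of one clip: fr(c_j)."""
    counts = Counter()
    for h in frames:
        counts.update(multi_label_quantize(h, templates, k))
    return [counts.get(j, 0) for j in range(1, len(templates) + 1)]


def tfidf_weights(freqs_per_video):
    """Step 3, Eq. (10.14): one weight vector f_v per video."""
    n_videos = len(freqs_per_video)                 # N_v (assumed meaning)
    n_templates = len(freqs_per_video[0])
    # n(c_j): number of videos whose label set contains template j (assumed).
    doc_freq = [sum(1 for fr in freqs_per_video if fr[j] > 0)
                for j in range(n_templates)]
    weights = []
    for fr in freqs_per_video:
        max_fr = max(fr) or 1                       # guard against empty clips
        weights.append([
            (fr[j] / max_fr) * math.log(n_videos / doc_freq[j])
            if doc_freq[j] else 0.0
            for j in range(n_templates)
        ])
    return weights
```

In this sketch a template that occurs in every video receives weight zero, since log(N_v / n(c_j)) = log 1 = 0, which is the usual effect of the IDF term: templates common to the whole database do not help to discriminate between videos.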