Let $x_i \in \mathbb{R}^{P}$ represent a visual descriptor of frame $f_i$. A video interval $I = [f_s, f_e]$ at any level is characterized by a set of video descriptors represented by

$$D_I = \{(x_s, f_s), (x_{s+1}, f_{s+1}), \ldots, (x_e, f_e)\}.$$

$D_I$ denotes the set of primary descriptors of $I$. It will be used for obtaining a secondary descriptor used for the video indexing.
Intuitively, a video descriptor database $VD$ for a video $F$ is defined as a set of video descriptors for $F$ and has the following form:

$$VD = \{(D_{I_1}, I_1), (D_{I_2}, I_2), \ldots, (D_{I_J}, I_J)\} \tag{7.12}$$
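The structure in Eq. (7.12) can be pictured as a list of (descriptor set, interval) pairs. The following is only a minimal sketch under stated assumptions: the names `Interval`, `primary_descriptors`, and `build_database` are hypothetical, frames are identified by integer indices, and the toy descriptors are made up for illustration.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple

Descriptor = Tuple[float, ...]  # x_i, a P-dimensional visual descriptor


@dataclass(frozen=True)
class Interval:
    """A video interval I = [f_s, f_e], given by its first and last frame index."""
    start: int  # f_s
    end: int    # f_e (inclusive)


def primary_descriptors(frames: Sequence[Descriptor],
                        interval: Interval) -> List[Tuple[Descriptor, int]]:
    """Return D_I as the list of pairs (x_s, f_s), ..., (x_e, f_e)."""
    return [(frames[i], i) for i in range(interval.start, interval.end + 1)]


def build_database(frames: Sequence[Descriptor], intervals: Sequence[Interval]):
    """Return VD as the list of pairs (D_I1, I1), ..., (D_IJ, IJ)."""
    return [(primary_descriptors(frames, iv), iv) for iv in intervals]


# Toy video F: six frames with 2-dimensional descriptors (P = 2), three intervals.
frames = [(0.1, 0.9), (0.2, 0.8), (0.9, 0.1), (0.8, 0.2), (0.5, 0.5), (0.4, 0.6)]
intervals = [Interval(0, 1), Interval(2, 3), Interval(4, 5)]
vd = build_database(frames, intervals)
```

A shot-, group-, or story-level database would be built the same way, passing only the intervals that belong to that level.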
Based on Eq. ( 7.12 ), video descriptor databases at the shot, group, and story levels
are defined as follows:
$$VD_{\mathrm{Shot}} = \{(D_{I_i}, I_i) \mid I_i \in I_{\mathrm{Shot}}(F)\} \tag{7.13}$$

$$VD_{\mathrm{Group}} = \{(D_{I_i}, I_i) \mid I_i \in I_{\mathrm{Group}}(F)\} \tag{7.14}$$

$$VD_{\mathrm{Story}} = \{(D_{I_i}, I_i) \mid I_i \in I_{\mathrm{Story}}(F)\} \tag{7.15}$$
In the above definitions, $D_I$ is regarded as the set of primary descriptors; it characterizes the video only at the frame level. To support video indexing, it is reorganized at a higher level into a set of secondary descriptors.
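As a rough illustration of what such a reorganization could look like, the sketch below maps each frame descriptor to its nearest visual template, counts how often each template occurs in an interval, and turns the counts into a weight vector. The template set, the function names, and the tf-idf-like weighting are assumptions made for this example; the actual weighting is the one defined by the model in Sect. 3.5 and may differ.

```python
import math
from collections import Counter


def nearest_template(x, templates):
    """Index r of the template closest to descriptor x (squared Euclidean distance)."""
    return min(range(len(templates)),
               key=lambda r: sum((a - b) ** 2 for a, b in zip(x, templates[r])))


def template_counts(descriptors, templates):
    """How often each template is the nearest match among one interval's frames."""
    hits = Counter(nearest_template(x, templates) for x in descriptors)
    return [hits.get(r, 0) for r in range(len(templates))]


def secondary_descriptors(interval_descriptors, templates):
    """One weight vector per interval (assumed weighting, not the book's formula):
    normalized template frequency times an inverse-interval-frequency term."""
    counts = [template_counts(d, templates) for d in interval_descriptors]
    n = len(interval_descriptors)
    # Number of intervals in which each template occurs at least once.
    df = [sum(1 for c in counts if c[r] > 0) for r in range(len(templates))]
    vectors = []
    for c in counts:
        peak = max(c) or 1  # guard against an interval with no frames
        vectors.append([(c[r] / peak) * math.log(1 + n / df[r]) if df[r] else 0.0
                        for r in range(len(templates))])
    return vectors


# Toy data: two templates (R = 2) and two shots with 2-dimensional descriptors.
templates = [(0.0, 1.0), (1.0, 0.0)]
shot_a = [(0.1, 0.9), (0.2, 0.8), (0.9, 0.1)]
shot_b = [(0.9, 0.1), (0.8, 0.2)]
vectors = secondary_descriptors([shot_a, shot_b], templates)
```

Because the vector depends only on which templates occur and how often, the same routine can be applied unchanged to the frames of a shot, a group of shots, or a whole story.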
7.3.2 Indexing and Retrieval of News Video
For a video descriptor database $VD = \{(D_{I_1}, I_1), \ldots, (D_{I_j}, I_j), \ldots, (D_{I_J}, I_J)\}$, where $D_I = \{(x_s, f_s), (x_{s+1}, f_{s+1}), \ldots, (x_e, f_e)\}$, the indexing process produces, from $D_{I_j}$, a secondary video descriptor for each interval $I_j$, specified as

$$v_j = [w_{j1}, \ldots, w_{jr}, \ldots, w_{jR}]^{t}.$$

The weights $w_{jr}$ are positive and non-binary. They are obtained by the template-frequency model (TFM) discussed in Sect. 3.5, Chap. 3.
Since the template-frequency model considers all the visual contents occurring in a video sequence (with the weight $w_{jr}$), this indexing technique can be applied to characterize video sequences at different levels, from the shot and group-of-shots levels up to the story level. This lets the system give the user access across levels, as depicted in Fig. 7.3: (a) shot-to-shot, (b) shot-to-group, (c) group-to-group, (d) group-to-story, and (e) shot-to-story.
This architecture accommodates retrieval from lower to higher levels, e.g., retrieval of a video group or story using a query from the shot or group level. A user is generally seeking information across the different levels defined in the segmented videos. To satisfy this demand, a video story at the higher level is expected to contain most of the visual contents occurring at the lower levels. For instance, to retrieve a full news story, a short shot that contains the anchor of the news story can be used as a query.
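Since every interval, whatever its level, is described by a weight vector of the same length, a shot-level query can be compared directly against a story-level index. The sketch below illustrates shot-to-story retrieval under assumptions of its own: cosine similarity is used as the matching score (the text above does not name a similarity measure), and the index entries, template meanings, and weights are invented for the example.

```python
import math


def cosine(u, v):
    """Cosine similarity between two weight vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def retrieve(query_vector, index, top_k=3):
    """Rank the entries of `index` (name -> weight vector) by similarity to the query."""
    scored = [(cosine(query_vector, vec), name) for name, vec in index.items()]
    scored.sort(reverse=True)  # highest similarity first
    return [name for _, name in scored[:top_k]]


# Hypothetical story-level index over R = 3 templates:
# template 0 ~ anchor-person frames, 1 ~ sports footage, 2 ~ weather maps.
story_index = {
    "story_politics": [0.9, 0.1, 0.0],
    "story_sports": [0.3, 0.9, 0.0],
    "story_weather": [0.2, 0.0, 0.9],
}

# A short shot that shows only the anchor is used as the query.
shot_query = [1.0, 0.0, 0.0]
ranking = retrieve(shot_query, story_index)
```

The same `retrieve` call covers the other access modes in Fig. 7.3 by swapping in a shot-level or group-level index, or a group-level query vector.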