Fig. 9.1 A flowchart of the proposed generic framework with one module of generic video
representation and three task modules in sequence
process is the input to the next process in a consistent and coherent fashion. There
are four modules in total: module 0 is the infrastructure for low-level feature
extraction and video representation using the BoW model. Modules 1-3 are the tasks
introduced in this chapter. The highlights of this framework include the following.
1. A generic foundation using domain knowledge-free local features was developed
to represent large-scale input videos. This method fits the general framework
of video analysis and offers an alternative way to alleviate generality,
scalability, and extensibility issues.
2. A thorough and systematic structure starting from genre identification is
presented. Some related work skipped this step by assuming the genre type as
prior knowledge.
3. A general platform is introduced to connect our method with the abundant and
valuable existing literature, as well as with varied and innovative input features.
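As a rough illustration of the BoW video representation that module 0 provides, the sketch below assigns each local descriptor to its nearest codeword and accumulates a normalized histogram. It is a minimal sketch under stated assumptions, not the chapter's implementation: the codebook is taken as given (its generation is not reproduced here), hard nearest-neighbor assignment and L1 normalization are assumed, and the names nearest_word and bow_histogram and the toy data are hypothetical.

```python
from math import dist  # Euclidean distance between two points (Python 3.8+)


def nearest_word(descriptor, codebook):
    """Return the index of the codeword closest to the descriptor."""
    return min(range(len(codebook)), key=lambda i: dist(descriptor, codebook[i]))


def bow_histogram(descriptors, codebook):
    """Map each local descriptor to its nearest codeword and count the hits.

    Returns an L1-normalized histogram with one bin per codeword
    (hard assignment and L1 normalization are assumptions of this sketch).
    """
    counts = [0] * len(codebook)
    for d in descriptors:
        counts[nearest_word(d, codebook)] += 1
    total = sum(counts)
    # A video with no descriptors yields an all-zero histogram.
    return [c / total for c in counts] if total else [0.0] * len(codebook)


# Toy data: a 3-word codebook over 2-D descriptors.
# Real local descriptors have many more dimensions.
codebook = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0)]
descriptors = [(0.1, 0.0), (0.9, 1.2), (1.1, 0.8), (4.7, 5.2)]
histogram = bow_histogram(descriptors, codebook)
```

In this toy case one descriptor falls in the first bin, two in the second, and one in the third. A fixed-length vector of this kind is what a downstream classifier would consume, regardless of how many descriptors the video produced.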
At module 0, low-level local features, combined with codebook generation and the
BoW model, provide an expandable groundwork for the semantic tasks of genre
categorization, view classification, and high-level event detection.
Most of the literature discusses domain knowledge and production rules at the
feature extraction level. In our structure, a homogeneous process is first introduced
for extracting domain knowledge-independent local descriptors. The BoW model
is used to represent an input video by mapping its local descriptors to a codebook,
which is generated from an innovative bottom-up parallel structure. The histogram-
based video representation is treated as the sole input (no other feature models) to
both the genre categorization and the view classification modules. Such a concise