Therefore, a two-level bottom-up structure is proposed in this work for efficient
codebook generation. At the bottom layer, individual genre codebooks are generated
in 1st-level K-means clustering. At the upper layer, the 1st-level codebooks are used
as the input for the 2nd-level K-means to build the generic codebook. By using
this bottom-up structure, we reduce the heavy computation in measuring individual
point-to-cluster-center distances in the K-means algorithm. Moreover, since the
1st-level K-means runs are independent of each other, distributed computing methods
can be applied to further reduce the computation time. A numerical analysis is given
in Sect. 9.4.1.
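The two-level structure described above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names (`kmeans`, `build_codebooks`), the codebook sizes `k_genre` and `k_generic`, and the use of plain Lloyd-style K-means with random initialization are all assumptions made for the example.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd-style K-means (assumed variant); returns k cluster centers."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda j: sq_dist(p, centers[j]))
            clusters[nearest].append(p)
        # New center = mean of its members; an empty cluster keeps its old center.
        centers = [
            tuple(sum(c) / len(members) for c in zip(*members)) if members else centers[j]
            for j, members in enumerate(clusters)
        ]
    return centers


def build_codebooks(features_by_genre, k_genre, k_generic):
    """Return (individual genre codebooks, generic codebook)."""
    # 1st level: one K-means per genre; the runs are independent of each other,
    # so they could be dispatched to separate workers.
    genre_codebooks = {g: kmeans(f, k_genre) for g, f in features_by_genre.items()}
    # 2nd level: cluster only the genre codewords, never the raw features.
    pooled = [w for cb in genre_codebooks.values() for w in cb]
    return genre_codebooks, kmeans(pooled, k_generic)
```

In this sketch the 2nd-level K-means sees only `k_genre` points per genre rather than every raw feature vector, which is where the reduction in distance computations comes from.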
Another advantage of bottom-up K-means clustering lies in system updates and
scalability. When videos of a new genre are added to the dataset, a codebook
update module computes the new genre's individual codebook. The result,
together with the existing codebooks, is used to generate the new generic codebook
by re-running only the 2nd-level K-means. When new videos are imported for an
existing genre, the corresponding 1st-level K-means is re-run to obtain the
updated individual codebook, and then the 2nd-level K-means is re-run to update the
generic codebook.
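Both update cases follow the same pattern, which might be sketched as below. The function `update_codebooks` and its parameters are hypothetical names introduced for illustration; `cluster(points, k)` stands for any K-means routine that returns `k` centers and is passed in as an assumption rather than taken from the source.

```python
def update_codebooks(genre_codebooks, genre, genre_features, cluster, k_genre, k_generic):
    """Refresh one genre's codebook, then rebuild the generic codebook.

    genre_codebooks: dict mapping genre name -> list of codewords
    genre:           the new or existing genre whose videos changed
    genre_features:  all feature vectors now available for that genre
    cluster:         callable (points, k) -> k centers (assumed K-means routine)
    """
    updated = dict(genre_codebooks)  # leave the caller's dict untouched
    # 1st level: only the affected genre is (re-)clustered.
    updated[genre] = cluster(genre_features, k_genre)
    # 2nd level: re-run on the pooled codewords of all genres.
    pooled = [w for cb in updated.values() for w in cb]
    return updated, cluster(pooled, k_generic)
```

Codebooks of unaffected genres are reused as they are, so the cost of an update is one 1st-level run plus the 2nd-level run.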
9.2.3 Low-Level Genre Categorization
In our proposed method, at the genre categorization stage, a query video is
represented as a histogram Q built with the generic codebook and the BoW model.
Then, a k-Nearest Neighbor (k-NN) classifier is applied with a defined dissimilarity
measure between the query Q and a trained individual genre P. Consequently,
the query video is identified as the genre whose distribution is closest to that of the
query under this measure. Technical details are presented in Sect. 9.4.1.
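As a rough sketch of this stage, the code below builds a normalized BoW histogram and picks the closest genre. It makes several assumptions not stated here: the dissimilarity measure is a placeholder L1 distance (the actual measure is the one defined in Sect. 9.4.1), each genre is summarized by a single histogram P so the classifier reduces to 1-NN, and the names `bow_histogram`, `l1_distance`, and `classify_genre` are hypothetical.

```python
def bow_histogram(features, codebook):
    """Normalized count of features assigned to their nearest codeword."""
    counts = [0] * len(codebook)
    for f in features:
        nearest = min(
            range(len(codebook)),
            key=lambda j: sum((x - y) ** 2 for x, y in zip(f, codebook[j])),
        )
        counts[nearest] += 1
    total = sum(counts) or 1  # avoid division by zero for an empty video
    return [c / total for c in counts]


def l1_distance(q, p):
    # Placeholder dissimilarity (assumption); swap in the measure from Sect. 9.4.1.
    return sum(abs(a - b) for a, b in zip(q, p))


def classify_genre(q, genre_histograms, dissimilarity=l1_distance):
    """Return the genre whose histogram P is closest to the query histogram Q."""
    return min(genre_histograms, key=lambda g: dissimilarity(q, genre_histograms[g]))
```

Passing the dissimilarity as a parameter keeps the classifier independent of the particular measure chosen.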
By identifying the genre of the query video, subsequent processes are confined to
a focused group, and the scale of computation is decreased. Therefore, advanced and
sophisticated techniques can be used in middle/high-level video analysis. In the next
step, training data is characterized by a frequency-based histogram representation.
Each individual genre is modeled as a distribution, denoted by P, using training
data of its own kind.
9.3 High-Level Event Detection Using Middle-Level View as Agent
Content-based video event detection is one of the most popular tasks in
high-level semantic analysis. Unlike video abstraction and summarization, which
target any interesting events happening in a video rush, event detection is
restricted to a predefined request type (such as the third goal or the second
penalty kick in a particular soccer match). In sports videos, a consumer's interest in
events lies in the actual video content, more than in the information delivered.