results in a valuable information exchange between the user and the computer,
programming the computer system to be self-learning is highly desirable.
Consequently, the interactive retrieval system of Fig. 1.1 a is generalized to
include a self-learning component, as shown in Fig. 1.1 b. The interaction and
relevance feedback modules are implemented in the form of specialized neural
networks. In these fully automatic models, the learning capability associated with
the networks and their ability to perform general function approximation offer
improved flexibility in modeling the user's preferences according to the submitted
query.
Pseudo RF offers multimedia retrieval in fully automatic and semi-automatic
modes, which allow: (1) avoidance of errors caused by excessive human involvement,
(2) utilization of unlabeled data to enlarge training sets, and (3) minimization
of iterations in RF. These properties are highly desirable for multimedia retrieval in
a cloud-data center.
The relevant topics include: the pseudo-RF method implemented by the
self-organizing tree map; compressed-domain features; energy histograms of the
discrete cosine transform (DCT); multi-resolution histograms of the wavelet
transform; re-ranking of images based on knowledge of the region of interest;
and re-ranking of videos using the adaptive cosine network.
1.3.2 Internet Scale Multimedia Analysis and Retrieval
To cope with large-scale multimedia classification and retrieval, this topic
presents the adoption of the bag-of-words (BoW) model for the analysis of images
and videos. A BoW model can effectively combine the locally extracted feature
vectors of either an image or a video frame. It focuses on the characteristics of
the local feature ensemble, and treats individual local descriptors uniformly. The
merits of the BoW model include the homogeneous process by which it compactly
represents images or video frames for classification, as well as its usability for
large-scale image retrieval, owing to its success in text retrieval. The relevant
topics are presented as follows.
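Before turning to those topics, the pooling step described above can be illustrated with a minimal sketch. It assumes that a codebook of visual words is already available and that each local descriptor is assigned to its nearest codeword by Euclidean distance. The function names, the toy 2-D codebook, and the descriptors are hypothetical and chosen only for illustration; real systems typically use high-dimensional descriptors and much larger codebooks.

```python
from math import dist  # Euclidean distance, available in Python 3.8+


def nearest_word(descriptor, codebook):
    """Return the index of the codeword closest to the descriptor."""
    return min(range(len(codebook)), key=lambda k: dist(descriptor, codebook[k]))


def bow_histogram(descriptors, codebook):
    """Pool local descriptors into one L1-normalized histogram over codewords."""
    counts = [0] * len(codebook)
    for d in descriptors:
        # Each local descriptor contributes exactly one vote (uniform treatment).
        counts[nearest_word(d, codebook)] += 1
    total = sum(counts)
    return [c / total for c in counts] if total else [0.0] * len(codebook)


# Hypothetical codebook with three visual words and five local descriptors
# extracted from a single image or video frame.
codebook = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0)]
descriptors = [(0.1, -0.1), (0.9, 1.2), (1.1, 0.8), (4.8, 5.1), (0.2, 0.0)]

hist = bow_histogram(descriptors, codebook)
print(hist)
```

Whatever the number of local descriptors, the result is a fixed-length vector with one bin per codeword, which is what makes the representation compact and directly comparable across images or frames.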
1.3.2.1 BoW in Unsupervised Classification and Video Analysis
The first topic describes the BoW model for unsupervised classification in video
analysis. A distinctive yet compact representation of the video clip is constructed
using the BoW model. Candidate videos are indexed and represented as
histograms using the learned BoW model. The advantage of using the BoW
model is that labeled data is not required. Therefore, video analysis can be realized
for large-scale applications.
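The label-free workflow can be sketched as follows, under stated assumptions: the codebook is learned with plain k-means (one common choice, not necessarily the method used in the chapters that follow), each clip is represented by one histogram pooled over all of its frames, and candidates are ranked by histogram intersection. All names and the toy 2-D descriptors are hypothetical.

```python
import random
from math import dist  # Python 3.8+


def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's k-means; learns k centroids from unlabeled points."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            j = min(range(k), key=lambda c: dist(p, centroids[c]))
            clusters[j].append(p)
        # New centroid is the mean of its members; keep the old one if empty.
        centroids = [
            tuple(sum(x) / len(m) for x in zip(*m)) if m else centroids[j]
            for j, m in enumerate(clusters)
        ]
    return centroids


def clip_histogram(frames, codebook):
    """Represent a clip as one normalized histogram over all frame descriptors."""
    counts = [0] * len(codebook)
    for frame in frames:
        for d in frame:
            j = min(range(len(codebook)), key=lambda c: dist(d, codebook[c]))
            counts[j] += 1
    total = sum(counts) or 1
    return [c / total for c in counts]


def intersection(h1, h2):
    """Histogram intersection; equals 1.0 for identical normalized histograms."""
    return sum(min(a, b) for a, b in zip(h1, h2))


# Toy clips: each clip is a list of frames, each frame a list of descriptors.
clips = {
    "clip_a": [[(0.0, 0.0), (0.2, 0.1), (10.0, 10.0)], [(0.1, 0.3), (0.3, 0.2)]],
    "clip_b": [[(10.2, 9.9), (9.8, 10.1)], [(10.1, 10.3), (0.2, 0.3)]],
}

# Learn the codebook from all descriptors; no labels are used anywhere.
training = [d for frames in clips.values() for frame in frames for d in frame]
codebook = kmeans(training, k=2)

# Index the candidates, then rank them against a query clip.
index = {name: clip_histogram(frames, codebook) for name, frames in clips.items()}
query = clip_histogram([[(0.1, 0.1), (0.2, 0.2), (9.9, 10.0)]], codebook)
ranked = sorted(index, key=lambda name: intersection(query, index[name]), reverse=True)
print(ranked)
```

Because both the codebook and the index are built from unlabeled descriptors, adding new clips only requires computing one more histogram, which is what makes the approach attractive at large scale.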
Chapter 8 of this topic presents a systematic and generic approach that uses
the BoW-based video representation. The system aims at event detection in