Digital Signal Processing Reference
regions of interest [ 2 ]. In the past several years, there has been rapidly growing interest
in content-based applications of video data, including video retrieval and browsing,
video summarization, video event analysis, and video editing. Efficient access to
large amounts of multimedia content is becoming increasingly
important. However, reliably extracting semantic content from
an image or video remains a very challenging task in computer vision and pattern
recognition.
In order to understand the scene content, we need to know what the basic
component of such content is. The common answer may be the semantic object, which
represents a data item together with its underlying semantic context. It may consist
of a flexible set of meta-attributes that explicitly describe the implicit assumptions
about the meaning of the data item [ 4 ]. Each semantic object should clearly specify
the relationship between the object and the real-world aspects it represents. Therefore, a crucial step
before image understanding is to separate the image/video into several
constituent parts.
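
To make the idea of separating an image into constituent parts concrete, the following minimal sketch groups neighboring pixels whose intensities are close to one another. It is only an illustration under stated assumptions, not a method taken from this chapter: simple region growing over 4-connected neighbors stands in for segmentation in general, and the function name segment_by_intensity, the tolerance parameter, and the toy image are all hypothetical.

```python
from collections import deque


def segment_by_intensity(image, tolerance=10):
    """Label 4-connected regions by region growing.

    A pixel joins a region when its intensity differs from an
    already-labeled neighbor in that region by at most `tolerance`.
    `image` is a list of equal-length rows of numbers (hypothetical input).
    Returns (labels, region_count); labels start at 1.
    """
    rows, cols = len(image), len(image[0])
    labels = [[0] * cols for _ in range(rows)]  # 0 means "not yet labeled"
    region_count = 0

    for r in range(rows):
        for c in range(cols):
            if labels[r][c]:
                continue
            # Start a new region from this unlabeled seed pixel.
            region_count += 1
            labels[r][c] = region_count
            queue = deque([(r, c)])
            while queue:
                y, x = queue.popleft()
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and not labels[ny][nx]
                            and abs(image[ny][nx] - image[y][x]) <= tolerance):
                        labels[ny][nx] = region_count
                        queue.append((ny, nx))
    return labels, region_count


# Toy grayscale image: a dark area, a bright area, and a mid-gray patch.
image = [
    [10, 12, 11, 200, 205],
    [9, 11, 13, 198, 202],
    [10, 10, 90, 95, 201],
    [12, 11, 92, 94, 199],
]
labels, count = segment_by_intensity(image)
for row in labels:
    print(row)
```

Each distinct label marks one constituent part of the toy image. Comparing a pixel only with its immediate neighbor keeps the sketch short, but it also lets a region drift across a smooth gradient; practical methods typically use richer similarity criteria such as color, texture, or motion.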
In general, segmentation can be defined as the process of partitioning data into
subsets whose members share similar characteristics. It has become a key
technique for semantic content extraction and plays an important role in digital
multimedia processing, pattern recognition, and computer vision. The goal of image
segmentation is highly application dependent, and the technique arises in many fields. A few
representative applications of image/video segmentation are as follows:
Object recognition, where segmentation is treated as a key component that
groups coherent image areas, which are then used to assemble and detect
objects [ 5 ]. As important recognition tasks, feature extraction and model matching
rely heavily on the quality of the image segmentation process. When an image is
segmented into several regions of homogeneous intensity, each region can be used
as a feature for deriving the category model, since such regions are rich descriptors that are usually
stable under small illumination and viewpoint changes [ 6 ].
Video monitoring, where an object can be divided into pieces to improve tracking
robustness to occlusion by tracking the evolution of the moving objects along the
time axis [ 7 ]. The segmented mask makes it possible to predict and identify an intruder or
an anomalous situation, and helps to reveal behaviors and to decide quickly
when “alerts” should be sent to the security unit.
Video indexing, which operates over segments of the media using the annotations
associated with the segments [ 8 , 9 ]. An ordered list of segments associated with
the query object is returned to the user; this approach has been applied to content
classification, representation, or understanding.
Data compression, which allows a suitable coding algorithm to manipulate each
object independently, resulting in improved subjective quality. Segmentation
is used to partition each frame of a video sequence into semantically meaningful
objects of arbitrary shape. More coding bits can be assigned to these object
regions [ 10 ], which can reduce visual artifacts after low-bit-rate coding.
Computer vision, where segmented objects from the input 2-D images or video
sequences can be used to construct the 3-D scene. For example, a stereo method for
image-based rendering was proposed based on image oversegmentation. Since