Digital Signal Processing Reference
etc.) rather than high-level knowledge about the object of interest. The resulting
segmentation is usually performed in an unsupervised manner. The second
method usually requires a database of human-annotated images to learn a prior
distribution, which supports high-level recognition by incorporating low-level
grouping results.
(5) Space-based mode: Depending on the kind of relation considered, we can classify
segmentation into spatial or temporal methods. The first partitions an image
according to the spatial relations among pixels, while the second divides a
sequence of frames into several segments along the temporal axis. For example,
we can use scene analysis techniques such as the detection of video cuts, fades,
wipes, and zooms to perform scene segmentation, so as to group frames
with similar content.
(6) Class-based mode: Many segmentation methods have been proposed to extract specific
objects (e.g., face, human, car, or building) from input images/videos. Since the
object class is known in advance, prior information about the object can be used to
improve the segmentation results. For example, in face segmentation, the
skin color distribution observed from samples is very helpful for face region
detection, allowing the face to be located efficiently.
(7) Semantic-specific mode: Unlike non-semantic segmentation, which extracts
uniform and homogeneous segments with respect to texture or color features,
semantic segmentation can be defined as a process that divides
an image into meaningful segments associated with some semantics.
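The temporal segmentation mentioned in item (5) can be illustrated with a minimal sketch. It assumes grayscale frames given as nested lists of 0..255 intensities and detects only hard cuts, by thresholding the histogram difference between consecutive frames. The function names, the bin count, and the threshold are illustrative assumptions and do not come from the chapter. Gradual transitions such as fades or wipes would need a different detector.

```python
from typing import List, Sequence

Frame = Sequence[Sequence[int]]  # grayscale frame: rows of 0..255 intensities


def gray_histogram(frame: Frame, bins: int = 16) -> List[float]:
    """Normalized intensity histogram of one non-empty grayscale frame."""
    counts = [0] * bins
    total = 0
    for row in frame:
        for value in row:
            counts[min(value * bins // 256, bins - 1)] += 1
            total += 1
    return [c / total for c in counts]


def detect_cuts(frames: Sequence[Frame], threshold: float = 0.5) -> List[int]:
    """Return indices i where a hard cut is assumed between frames i-1 and i.

    The distance is half the L1 difference of consecutive histograms, so it
    lies in [0, 1]. The threshold is an illustrative value (assumption).
    """
    if len(frames) < 2:
        return []
    cuts = []
    prev = gray_histogram(frames[0])
    for i in range(1, len(frames)):
        cur = gray_histogram(frames[i])
        distance = 0.5 * sum(abs(a - b) for a, b in zip(prev, cur))
        if distance > threshold:
            cuts.append(i)
        prev = cur
    return cuts


def split_into_shots(frames: Sequence[Frame], threshold: float = 0.5) -> List[List[int]]:
    """Group frame indices into temporal segments (shots) at detected cuts."""
    bounds = [0] + detect_cuts(frames, threshold) + [len(frames)]
    return [list(range(a, b)) for a, b in zip(bounds, bounds[1:])]
```

For a sequence of three dark frames followed by two bright frames, `split_into_shots` groups the frame indices into two segments, one per shot.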
Notice that existing segmentation methods can be classified into certain
categories based on the above analysis. Of course, there is no distinct boundary
between the different segmentation modes, which means that one can develop a
segmentation method by combining different modes. For example, unsupervised
over-segmentation is often employed as an important step in a top-down
segmentation method that groups the resulting segments into a semantic object.
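The combination of modes described above can be sketched as follows. This is a toy illustration under strong assumptions, not the method of any cited work: a fixed grid stands in for a real unsupervised over-segmentation (such as superpixels), and the top-down prior is reduced to a single expected mean intensity for the object. The names `oversegment`, `segment_means`, `group_object`, `prior_mean`, and `tol` are hypothetical.

```python
from typing import Dict, List

Image = List[List[int]]  # grayscale image: rows of intensities


def oversegment(image: Image, block: int = 2) -> List[List[int]]:
    """Label map from a fixed grid: a crude stand-in for superpixels."""
    height, width = len(image), len(image[0])
    blocks_per_row = (width + block - 1) // block
    return [
        [(y // block) * blocks_per_row + (x // block) for x in range(width)]
        for y in range(height)
    ]


def segment_means(image: Image, labels: List[List[int]]) -> Dict[int, float]:
    """Mean intensity of every segment in the label map."""
    sums: Dict[int, int] = {}
    counts: Dict[int, int] = {}
    for value_row, label_row in zip(image, labels):
        for value, label in zip(value_row, label_row):
            sums[label] = sums.get(label, 0) + value
            counts[label] = counts.get(label, 0) + 1
    return {label: sums[label] / counts[label] for label in sums}


def group_object(image: Image, labels: List[List[int]],
                 prior_mean: float, tol: float) -> List[List[int]]:
    """Top-down step: keep the segments whose mean matches the object prior.

    Returns a binary mask (1 = object). The prior is a single expected
    intensity, which is an assumption made only for this illustration.
    """
    means = segment_means(image, labels)
    keep = {label for label, mean in means.items()
            if abs(mean - prior_mean) <= tol}
    return [[1 if label in keep else 0 for label in row] for row in labels]
```

In practice the prior would be learned from annotated examples and would use richer features than a mean intensity. The sketch only shows how a low-level partition and high-level knowledge can be chained.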
Because image segmentation is application oriented, it is very difficult to
measure the quality of a given segmentation against a uniform criterion. This means
that the answers to “what is a good segmentation?” and “how do we distinguish good
segmentations from bad segmentations?” depend heavily on the application scenario.
Therefore, many researchers answer these questions by making assumptions about the
goodness of a segmentation, such as the principle of good continuation, which states
a property that a good segmentation should have [13].
The goal of this chapter is to review, both theoretically and practically, different
methods for image/video segmentation. To achieve this goal, we focus our attention on
the task of image/video segmentation only. It may be helpful for the reader to know
that many other articles have reviewed image segmentation
from a variety of perspectives in the last decade, such as reviews of image segmentation
techniques [14, 15], a survey of ultrasound image segmentation [16], and an overview of
video segmentation [17]. In this chapter, we consider not only the existing methods
that are classics or milestones in the field, but also the trends and challenges that may
promote future research work.