Digital Signal Processing Reference
3.1 Introduction
The task of semantic object segmentation is to assign each pixel in an image or a
video sequence to one of a set of object classes with semantic meanings (see examples
in Fig. 3.1). The object classes can be predefined or learned without supervision from a
collection of images or videos. This task differs from unsupervised image and video
segmentation, which groups pixels into regions of homogeneous color or
texture that carry no semantic meaning. Semantic object segmentation has important
applications in image and video search, editing, and compression. For example, semantic
regions and their 2D spatial arrangement, as sketched by a user, can serve as a query to
retrieve images. Segmented objects can be deleted from images or copied between images.
Different regions of an image can be enhanced in different ways according to their
semantic meanings.
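To make the task definition concrete, the following minimal sketch represents the output of a semantic segmentation as a per-pixel label map and derives a per-class pixel count and a binary object mask from it. The class names, the tiny 4x6 label map, and the helper names `pixels_per_class` and `object_mask` are illustrative assumptions, not taken from any dataset or from the method discussed in this chapter.

```python
from collections import Counter

# Hypothetical class list for illustration; real datasets define their own.
CLASSES = ["grass", "cow", "sky"]

# A tiny 4x6 "image" represented only by its per-pixel label map:
# each entry is an index into CLASSES.
label_map = [
    [2, 2, 2, 2, 2, 2],
    [2, 2, 1, 1, 2, 2],
    [0, 1, 1, 1, 1, 0],
    [0, 0, 0, 0, 0, 0],
]


def pixels_per_class(labels):
    """Count how many pixels carry each semantic label."""
    counts = Counter(idx for row in labels for idx in row)
    return {CLASSES[idx]: n for idx, n in counts.items()}


def object_mask(labels, class_name):
    """Return a binary mask that is True where a pixel belongs to class_name."""
    target = CLASSES.index(class_name)
    return [[idx == target for idx in row] for row in labels]


counts = pixels_per_class(label_map)
cow_mask = object_mask(label_map, "cow")
```

A mask of this kind is the sort of structure an editing application could use to delete an object or copy it to another image, since it identifies exactly which pixels belong to the object.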
Semantic object segmentation is a very challenging problem because a very large
number of object classes must be distinguished, some object classes are
visually similar, and each object class may have very large visual variability. These
object classes can be structured, such as cars and airplanes, or unstructured, such
as grass fields and water. Because of variations in viewpoint, pose, illumination, and
occlusion, objects of the same class differ in appearance across images. To
develop a successful semantic object segmentation algorithm, three
important factors must be considered: local appearance, label consistency between
Fig. 3.1 Examples of images (first row) and manually segmented objects (second row) from
PASCAL VOC 2009 [1] (a) and MSRC 21 [2] (b). Different colors represent object categories