Semantic Object Segmentation - Video Segmentation and Its Applications

3.1 Introduction

The task of semantic object segmentation is to assign each pixel in an image or a video sequence to one of a set of object classes with semantic meanings (see examples in Fig. 3.1). The object classes can be predefined or learned in an unsupervised manner from a collection of images or videos. This task differs from unsupervised image and video segmentation, which groups pixels into regions of homogeneous color or texture without semantic meanings. Semantic object segmentation has important applications in image and video search, editing, and compression. For example, semantic regions and their 2D spatial arrangement, as sketched by users, can be used as a query to retrieve images. Segmented objects can be deleted from images or copied between images. Different regions of an image can be enhanced in different ways based on their semantic meanings.
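
To make the task definition above concrete, the following minimal sketch represents a segmentation result as a per-pixel label map and extracts one object as a binary mask, which is the kind of operation that deleting or copying a segmented object relies on. The class names, the toy 4 x 6 label map, and the helper functions `class_pixel_counts` and `object_mask` are all hypothetical and chosen only for illustration; they are not taken from PASCAL VOC 2009 or MSRC 21.

```python
# Hypothetical class list and toy label map, for illustration only.
# Each entry of label_map is the index of the semantic class of one pixel.
CLASS_NAMES = ["grass", "water", "car"]

label_map = [
    [0, 0, 0, 1, 1, 1],
    [0, 0, 2, 2, 1, 1],
    [0, 0, 2, 2, 1, 1],
    [0, 0, 0, 1, 1, 1],
]


def class_pixel_counts(labels, class_names):
    """Count how many pixels carry each semantic class label."""
    counts = {name: 0 for name in class_names}
    for row in labels:
        for class_index in row:
            counts[class_names[class_index]] += 1
    return counts


def object_mask(labels, class_index):
    """Return a binary mask: 1 where the pixel belongs to the class, else 0."""
    return [[1 if value == class_index else 0 for value in row] for row in labels]


counts = class_pixel_counts(label_map, CLASS_NAMES)
car_mask = object_mask(label_map, CLASS_NAMES.index("car"))

print(counts)
for mask_row in car_mask:
    print(mask_row)
```

In this representation every pixel receives exactly one class index, so the per-class counts sum to the number of pixels in the image. An unsupervised segmentation could produce a similar grid of region indices, but those indices would not map to named classes such as "car".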

Semantic object segmentation is a very challenging problem because there is a very large number of object classes to be distinguished, some object classes are visually similar, and each object class may have very large visual variability. These object classes can be structured, such as cars and airplanes, or unstructured, such as grass fields and water. Due to variations in viewpoint, pose, illumination, and occlusion, objects of the same class have different appearances across images. In order to develop a successful semantic object segmentation algorithm, three important factors must be considered: local appearance, label consistency between

Fig. 3.1 Examples of images (first row) and manually segmented objects (second row) from PASCAL VOC 2009 [1] (a) and MSRC 21 [2] (b). Different colors represent object categories.
