Digital Signal Processing Reference
As an important low-level feature, motion can provide semantic information that is otherwise missing in cases where uniform motion is expected. To extract moving objects, motion estimation methods are needed, especially when change detection masks have been shown to be ineffective. An interesting work on moving object segmentation is presented in [2]. The core of this algorithm is an object tracker that matches a two-dimensional (2D) binary model of the object against subsequent frames using the Hausdorff distance. To achieve this goal, the first step is to detect a dominant global motion, modeled by a six-parameter affine transformation, that can be assigned to the background. An object tracker based on the Hausdorff distance is then established to measure the temporal correspondence of objects and to improve robustness to noise and to changes in object shape over the video sequence.
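The Hausdorff distance used by such a tracker can be sketched as follows. This is a brute-force illustration (not the tracker of [2] itself), assuming the object model and the candidate region are given as small binary NumPy masks; practical trackers compute it via distance transforms instead.

```python
import numpy as np

def hausdorff_distance(mask_a, mask_b):
    """Symmetric Hausdorff distance between two non-empty 2D binary masks.

    Brute-force pairwise version for illustration only; real trackers use
    distance transforms to avoid the O(N*M) distance matrix.
    """
    pts_a = np.argwhere(mask_a)  # (N, 2) foreground pixel coordinates
    pts_b = np.argwhere(mask_b)  # (M, 2)
    # All pairwise Euclidean distances between foreground pixels.
    d = np.linalg.norm(pts_a[:, None, :] - pts_b[None, :, :], axis=-1)
    # Directed distances: worst-case nearest-neighbor distance in each direction.
    h_ab = d.min(axis=1).max()
    h_ba = d.min(axis=0).max()
    return max(h_ab, h_ba)

# Example: a 2x2 square and the same square shifted one pixel to the right.
a = np.zeros((5, 5), dtype=bool); a[1:3, 1:3] = True
b = np.zeros((5, 5), dtype=bool); b[1:3, 2:4] = True
print(hausdorff_distance(a, b))  # -> 1.0
```

Because the distance is small only when every model pixel lies near some candidate pixel and vice versa, minimizing it over candidate positions yields a shape-tolerant match.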
1.3
Technological Trends for Image/Video Segmentation
Most past research on video segmentation has relied on two principles: spatial (i.e., image) segmentation and temporal segmentation. If we treat the motion cue as one of the low-level features, alongside intensity, color, and texture, many image segmentation algorithms can be readily extended to video segmentation. For example, to segment a moving object out of a video clip, a 3D graph cut was presented to partition watershed presegmentation regions into foreground and background while preserving temporal coherence; for each frame, the segmentation in each tracked window is then refined using a 2D graph cut based on a local color model [36]. In this section, we address the following trends in segmentation algorithms, with particular attention to spatial-domain segmentation.
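The graph-cut partitioning idea can be illustrated on a toy scale (this is not the 3D formulation of [36]): each pixel gets capacities to a source (foreground affinity) and a sink (background affinity), neighboring pixels share a smoothness capacity, and the minimum s-t cut yields the labeling. The sketch below labels a single row of pixels with a minimal Edmonds-Karp max-flow; the unary costs and smoothness weight are hypothetical.

```python
from collections import deque
import numpy as np

def min_cut_labels(unary_fg, unary_bg, pairwise):
    """Binary labeling of a 1D pixel row via s-t min-cut (toy Edmonds-Karp).

    unary_fg[i]: capacity source -> pixel i (affinity to foreground).
    unary_bg[i]: capacity pixel i -> sink (affinity to background).
    pairwise:    smoothness capacity between adjacent pixels.
    """
    n = len(unary_fg)
    s, t = n, n + 1
    cap = np.zeros((n + 2, n + 2))
    for i in range(n):
        cap[s, i] = unary_fg[i]
        cap[i, t] = unary_bg[i]
        if i + 1 < n:
            cap[i, i + 1] = cap[i + 1, i] = pairwise  # smoothness term
    # Edmonds-Karp: push flow along shortest augmenting paths until none remain.
    while True:
        parent = {s: None}
        q = deque([s])
        while q and t not in parent:
            u = q.popleft()
            for v in range(n + 2):
                if v not in parent and cap[u, v] > 1e-9:
                    parent[v] = u
                    q.append(v)
        if t not in parent:
            break
        # Bottleneck along the path, then update residual capacities.
        path, v = [], t
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        f = min(cap[u, v] for u, v in path)
        for u, v in path:
            cap[u, v] -= f
            cap[v, u] += f
    # Pixels still reachable from the source in the residual graph are foreground.
    reach, q = {s}, deque([s])
    while q:
        u = q.popleft()
        for v in range(n + 2):
            if v not in reach and cap[u, v] > 1e-9:
                reach.add(v)
                q.append(v)
    return [i in reach for i in range(n)]

labels = min_cut_labels([5, 5, 1, 0, 0], [0, 0, 1, 5, 5], pairwise=2)
print(labels)  # first two pixels foreground, last two background
```

Extending the neighborhood from a 1D row to a 2D grid, or stacking frames into a 3D spatio-temporal volume as in [36], changes only how the smoothness edges are laid out.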
1.3.1
Towards 'Good' Segmentation
An emerging trend is to answer the question “What is a good partition for an image?” An interesting line of work in the current literature groups pixels into “superpixels”: local, coherent regions that preserve most of the structure necessary for segmentation at the scale of interest [13, 37]. To generate the superpixel map, the Ncut segmentation algorithm is used, incorporating contour and texture cues. To find the “good” segmentation, Gestalt grouping cues such as contour, texture, brightness, and good continuation are combined in a principled way, with a linear classifier trained to combine these features.
An example of superpixel segmentation with 200 superpixels is shown in Fig. 1.5. The original flower image is shown in Fig. 1.5a, and its superpixel map is given in Fig. 1.5b. The segmentation result in Fig. 1.5c shows that a distinct improvement can be achieved over classic methods.