Digital Signal Processing Reference
dual problem representation [Wang, 1995]. However, it is unclear how
this algorithm extends to multiple frames. Mesh-based analysis divides the
frame into a mesh of small triangular regions and tracks each region
individually [Altunbasak et al., 1997]. Although the work tracks
the object membership of each region well, the initial determination of
membership of each region is not a well-understood problem.
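To make the mesh idea concrete, the following is a minimal sketch of dividing a frame into small triangular regions. It assumes a regular grid whose cells are each cut along one diagonal; the function names, the `step` parameter, and the regular layout are illustrative assumptions and are not taken from [Altunbasak et al., 1997], which may construct and adapt its mesh differently. Tracking and the assignment of object membership are not shown.

```python
from typing import List, Tuple

Point = Tuple[int, int]
Triangle = Tuple[Point, Point, Point]


def triangular_mesh(width: int, height: int, step: int) -> List[Triangle]:
    """Cover a width x height frame with triangles on a regular grid.

    Hypothetical helper: each step x step cell (clipped at the frame
    border) is cut along one diagonal into two triangles.
    """
    if step <= 0:
        raise ValueError("step must be positive")
    triangles: List[Triangle] = []
    for y0 in range(0, height, step):
        y1 = min(y0 + step, height)  # clip the last row of cells
        for x0 in range(0, width, step):
            x1 = min(x0 + step, width)  # clip the last column of cells
            # Upper-left and lower-right halves of the cell.
            triangles.append(((x0, y0), (x1, y0), (x0, y1)))
            triangles.append(((x1, y0), (x1, y1), (x0, y1)))
    return triangles


def triangle_area(t: Triangle) -> float:
    """Area of a triangle from the cross product of two edge vectors."""
    (ax, ay), (bx, by), (cx, cy) = t
    return abs((bx - ax) * (cy - ay) - (cx - ax) * (by - ay)) / 2


# An 8 x 6 frame with 4-pixel cells gives 4 cells, hence 8 triangles.
mesh = triangular_mesh(8, 6, 4)
```

Because the triangles tile the frame without overlap, their areas sum to the frame area, which is a quick sanity check on any mesh construction of this kind.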
In the current literature, Choi's work in video object segmentation is
the most impressive in its theory and system design. His system design
begins with the usual motion-based segmentation algorithm, progresses
to the integration of visual information and, finally, expands the mod-
eling of the motion information by a hierarchy of motion models [Choi
et al., 1995] [Choi et al., 1997b] [Choi et al., 1997a] [Choi and Kim,
1996]. Our own work has progressed in a similar manner, but with a
different type of computational framework.
2. PROBLEM SIMPLIFICATIONS
We simplify some aspects of the MPEG-4 video object segmentation
problem to define the scope of our analysis. The general problem description for
video object segmentation is as follows: given a video sequence, partition
it into video objects such that each partition
contains the projection of only one physical object (see Figure 4.2) [Com-
mittee, 1998] [Wang and Adelson, 1994]. The following simplifications
clearly classify the future challenges either 1) as an inherent system
drawback or 2) as a system extension that can be temporarily solved by
human guidance:
1. The video sequence is grayscale.
Although color information may help to distinguish the objects from
the background, we treat only grayscale images. Color information
extends the dimensionality of the feature space. Exploiting color
for segmentation involves the study of perceptual qualities of colors
and is beyond the scope of this topic. In the future, color or texture
analysis may be integrated into our system in Eq. 4.5.
2. The video sequence contains the same objects throughout
its run, i.e., no object completely leaves or enters the video
sequence. For our system, we rely upon manual object detection.
In the future, this analysis may be done automatically.
3. The video objects to be extracted have boundaries that are
not primarily defined by the camera frame. The background
objects (such as the sea and land in Figure 4.2b) have different char-
acteristics than foreground objects and should be treated as sepa-
 