2. MPEG-4 AND MPEG-7 SYSTEM SYNERGY
We explore the natural synergy between MPEG-4 and MPEG-7, not only using MPEG-4 segmentation results as the basis for object representation, but also feeding a priori knowledge from MPEG-7 databases back into video object segmentation. As shown in Fig. 7.2, we demonstrate these connections with a proposed system that encapsulates both a video object segmentation system and a video object shape query system. The Voronoi Ordered Spaces of Chapter 3 are the common theoretical thread that draws the two systems together: they play a major role in each system, integrating shape information into the segmentation algorithm and extracting a robust representation of a shape. Centered around Voronoi Ordered Spaces, the two systems are unified by the flow of shape information from the knowledge database into segmentation, and then back into improved object representation. By placing each system within a feedback loop, we propose an experimental system that can improve the quality of both its segmentation results and its knowledge database. Linked together in this way, the two systems form an adaptive/neural system that begins with human guidance but can adapt and grow on its own, much like the dynamic system of the human mind.
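
As a minimal Python sketch of the MPEG-7 side of this arrangement, the fragment below models the knowledge database as an archive of labeled shape descriptors that can be queried by similarity. The names ShapeKnowledgeBase and voronoi_ordered_descriptor are illustrative placeholders, and the descriptor shown is a crude stand-in; the actual representation is the Voronoi Ordered Space construction of Chapter 3.

import numpy as np

def voronoi_ordered_descriptor(mask):
    """Illustrative stand-in for the Voronoi Ordered Space representation
    of Chapter 3: reduces a binary object mask to a small feature vector."""
    ys, xs = np.nonzero(mask)
    if xs.size == 0:
        return np.zeros(3)
    h, w = np.ptp(ys) + 1, np.ptp(xs) + 1          # bounding-box extent
    return np.array([w / h, xs.size / (h * w), float(xs.size)])

class ShapeKnowledgeBase:
    """MPEG-7 side: an archive of labeled shape descriptors (e.g., skeletons)."""
    def __init__(self):
        self._entries = []                           # list of (label, descriptor)

    def archive(self, label, descriptor):
        self._entries.append((label, np.asarray(descriptor, dtype=float)))

    def query(self, descriptor, k=1):
        # Return the k archived shapes most consistent with the estimate.
        ranked = sorted(self._entries,
                        key=lambda e: np.linalg.norm(e[1] - descriptor))
        return ranked[:k]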
We hope to demonstrate the possibility of a meta-system that learns, improving segmentation quality and, in turn, its own knowledge database.
For example, we begin with an unclassified video sequence of a horse. Without any a priori knowledge, the segmentation system bootstraps itself using low-level information, such as optical flow segmentation, to first locate the general blob of the horse and extract an estimate of its visual characteristics (for this system in particular we concentrate on shape, but color, size, texture, and speed are also valid). From this first estimate, we query our video database for any archived skeletons that are consistent with the estimate. From this query we (ideally) receive a skeleton of a horse and integrate that information to iteratively refine our segmentation result. When the segmentation result has converged, we archive the good-quality segmentation in our video database, improving both segmentation and classification of future unknown sequences.
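
The control flow of this feedback loop can be sketched as follows. The function refine_with_knowledge and the callables it expects (bootstrap_segment, describe, refine) are hypothetical placeholders for the actual low-level segmentation, Voronoi Ordered Space extraction, and shape-constrained refinement; only the loop structure is meant to mirror the example above.

import numpy as np

def refine_with_knowledge(frames, bootstrap_segment, describe, kb,
                          refine, max_iters=10, tol=1e-3):
    """Feedback-loop sketch: refine a video object mask with archived shapes.

    bootstrap_segment(frames)        -> initial mask from low-level cues
    describe(mask)                   -> shape descriptor of the mask
    kb.query(d) / kb.archive(l, d)   -> access to the MPEG-7 shape database
    refine(frames, mask, archived)   -> mask improved with the prior shape
    """
    mask = bootstrap_segment(frames)          # e.g., optical-flow blob of the horse
    prev = describe(mask)
    label = "unclassified"
    for _ in range(max_iters):
        matches = kb.query(prev, k=1)         # ask the database for a consistent skeleton
        if not matches:
            break                             # no a priori knowledge available yet
        label, skeleton = matches[0]
        mask = refine(frames, mask, skeleton) # fold the archived shape back in
        cur = describe(mask)
        if np.linalg.norm(cur - prev) < tol:  # descriptor stopped changing: converged
            break
        prev = cur
    kb.archive(label, describe(mask))         # grow the database for future sequences
    return mask, label

Convergence is judged on the shape descriptor rather than the raw mask, so the loop terminates once the archived skeleton no longer changes the extracted shape.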
3. INTELLIGENT VIDEO TRANSPORT
In the network of the near future (see Figure 7.3), video will be as ubiquitous as telephony, requiring Intelligent Video Transport, i.e., using content information to predict and adapt to changing network conditions.