Digital Signal Processing Reference
Multi-View Coding (MVC) technologies in October 2004. Some evidence
was recognized in January 2005, and a Call for Proposals on MVC was
officially issued in July 2005. The responses to the Call were evaluated
in January 2006, and MVC was finally standardized in July 2008.
Several requirements have been set for modern multi-view coding systems.
These include compression efficiency; scalability in the viewpoint
direction; and spatial, temporal and SNR scalability. In addition,
a multi-view coding system must be backward compatible. Low-delay coding
and decoding is another requirement, needed to enable real-time
collaborative and interactive 3D applications. Since the visual data load
concerned is high, view, temporal and spatial random access is also
critical, and any multi-view coding suite should offer low-delay random
access. Besides, decoder resource management needs to be taken into
account when designing a multi-view coding algorithm. Parallel processing
of different views or segments of the multi-view video is also important
to facilitate efficient encoder and decoder implementations. These issues
are particularly essential for deploying multi-view applications on
mobile devices.
The methods used to exploit inter-view redundancies can be separated
into two classes: inter-view redundancy removal via reference frame-based
techniques and inter-view redundancy removal via disparity-based
techniques.
The techniques based on reference frames selected from different views
adapt the existing motion estimation and compensation algorithms to remove
the inter-view redundancies. Basically, the disparity among different views
is treated as motion in the temporal direction, and the same methods are
applied to model disparity fields. One example of such a multi-view video
coding method, introduced in [19] and called Hierarchical B-Frame
Prediction, takes the hierarchical decomposition structure applied in the
temporal domain and applies it in the view domain as well.
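To make the idea of treating disparity as motion concrete, the following
is a minimal sketch of exhaustive block matching with a sum of absolute
differences (SAD) cost. It is not the method of [19]; the function names
(best_match, shift_right), the block size and the search range are
assumptions chosen for illustration. The same search routine serves both
cases: passing an earlier frame of the same view as the reference yields
a motion vector, while passing the co-temporal frame of a neighbouring
view yields a disparity vector.

```python
import random


def best_match(cur, ref, y, x, block=4, search=4):
    """Return ((dy, dx), cost) minimising SAD for the block of `cur` at (y, x).

    `cur` and `ref` are 2-D lists of integer samples. `ref` may be an
    earlier frame of the same view (motion) or a frame of a neighbouring
    view (disparity); the search itself is identical.
    """
    height, width = len(ref), len(ref[0])
    best_vec, best_cost = None, None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            ry, rx = y + dy, x + dx
            # Skip candidate blocks that fall outside the reference frame.
            if ry < 0 or rx < 0 or ry + block > height or rx + block > width:
                continue
            cost = sum(
                abs(cur[y + i][x + j] - ref[ry + i][rx + j])
                for i in range(block)
                for j in range(block)
            )
            if best_cost is None or cost < best_cost:
                best_vec, best_cost = (dy, dx), cost
    return best_vec, best_cost


def shift_right(frame, d):
    """Hypothetical 'neighbouring view': every row shifted right by d samples."""
    return [[row[0]] * d + row[:-d] for row in frame]


# Synthetic test data: a random frame and a horizontally displaced copy.
random.seed(0)
base_view = [[random.randrange(256) for _ in range(24)] for _ in range(16)]
other_view = shift_right(base_view, 3)

# Predict a block of the displaced view from the base view.
vec, cost = best_match(other_view, base_view, y=6, x=10)
print("disparity vector:", vec, "SAD:", cost)
```

A real encoder would add sub-sample refinement, rate-distortion
optimised vector selection and faster search patterns; the sketch only
shows that one search routine can model both kinds of displacement.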
Figure 3.6 shows the prediction structure used in this specific multi-view
coding method [19]. The horizontally directed arrows in Figure 3.6 represent
referencing in the time domain, whereas the vertically directed arrows in red
represent referencing in the view domain. In this way, a frame being encoded
may have references from both the same view and neighbouring views.
In particular, the pictures belonging to the highest temporal and view level
(the light blue frames in the B-predicted view) are the most efficiently
coded frames inside the multi-view sequence, since they are predicted from
both temporal references and inter-view references. However, more memory is
required for reference frames, and more dependent frames must be decoded
before a frame in the highest temporal and view level can be decoded.
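The decoding cost mentioned above can be illustrated with a small
dependency model. The sketch below is a simplified assumption, not the
exact structure of Figure 3.6: it uses a dyadic temporal hierarchy with
a group of pictures of 8 frames, anchor frames without temporal
references, and three hypothetical views in which view 0 is the base
view, view 2 is predicted from view 0, and view 1 is B-predicted from
views 0 and 2. All names (temporal_refs, VIEW_REFS, references,
decode_set) are introduced here for illustration.

```python
# Assumed inter-view references: view -> views it is predicted from.
VIEW_REFS = {0: [], 1: [0, 2], 2: [0]}


def temporal_refs(t, gop=8):
    """Temporal references of frame t in a dyadic hierarchical-B GOP."""
    pos = t % gop
    if pos == 0:
        return []  # anchor frame: no temporal reference in this sketch
    step = 1
    while pos % (step * 2) == 0:
        step *= 2
    # A B-frame is predicted from its two neighbours on the next coarser level.
    return [t - step, t + step]


def references(view, t, gop=8):
    """Direct references of frame (view, t): temporal first, then inter-view."""
    refs = [(view, r) for r in temporal_refs(t, gop)]
    refs += [(v, t) for v in VIEW_REFS[view]]
    return refs


def decode_set(view, t, gop=8):
    """All frames that must be decoded before frame (view, t)."""
    needed, stack = set(), [(view, t)]
    while stack:
        v, s = stack.pop()
        for ref in references(v, s, gop):
            if ref not in needed:
                needed.add(ref)
                stack.append(ref)
    return needed


# Compare a highest-level frame in the base view and in the B-predicted view.
print("base view, t=1:", len(decode_set(0, 1)), "frames needed")
print("B-predicted view, t=1:", len(decode_set(1, 1)), "frames needed")
```

Under these assumptions, a frame at the highest temporal level of the
B-predicted view depends on several times as many previously decoded
frames as the frame at the same time instant in the base view, which is
the memory and delay cost described in the text.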
One drawback of such an encoding structure is that there is no distinction
between the motion in time and the disparity among the frames of different
views. The motion fields and disparity fields have different characteristics. In
general, motion in time is bounded within a certain search field and changes