Compression - 3DTV: Processing and Transmission of 3D Video Signals


Multi-View Coding (MVC) technologies in October 2004. Evidence was

recognized in January 2005 and a Call for Proposals on MVC was

officially issued in July 2005. Then, the responses to the Call were evaluated

in January 2006. Finally, MVC was standardized in July 2008.

Several requirements have been set for modern multi-view coding systems.

These include compression efficiency, scalability in the viewpoint

direction, and spatial, temporal and SNR scalability. In addition,

a multi-view coding system must be backward compatible. Coding and

decoding with low delay is another requirement of multi-view coding to

enable real-time collaborative and interactive 3D applications. Since the

visual data load involved is high, view, temporal and

spatial random access is also critical, and any multi-view coding suite should

have low-delay random access characteristics. Decoder resource management

also needs to be taken into account when designing a multi-view coding

algorithm. Parallel processing of different views or segments

of the multi-view video is also important to facilitate efficient encoder and

decoder implementations. These issues are particularly important for deploying

multi-view applications on mobile devices.

Methods that exploit inter-view redundancies can be separated into two

classes: inter-view redundancy removal via reference frame-based

techniques and inter-view redundancy removal via disparity-based

techniques.

The techniques based on reference frames selected from different views

adapt existing motion estimation and compensation algorithms to remove

the inter-view redundancies. Essentially, the disparity among different views is

treated as motion in the temporal direction and the same methods are applied

to model disparity fields. One example of such a multi-view video coding

method, introduced in [19] and called Hierarchical B-Frame Prediction,

applies the hierarchical decomposition structure used in the temporal

domain to the view domain as well.
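The idea of treating disparity as motion can be illustrated with a minimal sketch, which is not the method of [19]. It assumes an exhaustive block-matching search with a sum-of-absolute-differences (SAD) cost; the function name best_match, the block size, the search range and the synthetic frames are all hypothetical choices made for illustration. The same search routine is run once against a temporal reference and once against an inter-view reference.

```python
import numpy as np

def best_match(block, reference, top, left, search_range):
    """Exhaustive SAD search for `block` around (top, left) in `reference`.

    Returns the (dy, dx) displacement with the smallest SAD, and that SAD.
    """
    h, w = block.shape
    best_vec, best_sad = (0, 0), float("inf")
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            y, x = top + dy, left + dx
            # Skip candidates that fall outside the reference picture.
            if y < 0 or x < 0 or y + h > reference.shape[0] or x + w > reference.shape[1]:
                continue
            candidate = reference[y:y + h, x:x + w]
            sad = int(np.abs(block.astype(np.int64) - candidate.astype(np.int64)).sum())
            if sad < best_sad:
                best_vec, best_sad = (dy, dx), sad
    return best_vec, best_sad

# Synthetic data (assumption): random texture so that matches are unambiguous.
rng = np.random.default_rng(0)
current = rng.integers(0, 256, size=(32, 48), dtype=np.uint8)

# Temporal reference: same view, earlier time, content displaced by (1, -2).
temporal_ref = np.roll(current, shift=(1, -2), axis=(0, 1))
# Inter-view reference: neighbouring view, same time, horizontal shift of 5.
interview_ref = np.roll(current, shift=5, axis=1)

top, left, size = 8, 16, 8
block = current[top:top + size, left:left + size]

# The identical search is used for both kinds of reference picture.
motion_vec, motion_sad = best_match(block, temporal_ref, top, left, search_range=6)
disparity_vec, disparity_sad = best_match(block, interview_ref, top, left, search_range=6)
print("motion vector:", motion_vec, "SAD:", motion_sad)
print("disparity vector:", disparity_vec, "SAD:", disparity_sad)
```

In this toy setup the encoder does not need to know whether a reference picture comes from another time instant or another view; only the content of the reference changes.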

Figure 3.6 shows the prediction structure used in this specific multi-view

coding method [19]. The horizontally directed arrows in Figure 3.6 represent

referencing in the time domain, whereas the vertically directed arrows in red

represent referencing in the view domain. In this way, a frame being encoded

may have references both from the same view and from neighbouring views.

In particular, the pictures belonging to the highest temporal and view level

(the light blue frames in B-predicted view) are the most efficiently coded

frames inside the multi-view sequence, since they are predicted from both

temporal references and inter-view references. However, more memory is

required for reference frames and more dependent frames must be decoded

before the frame in the highest temporal and view level can be decoded.
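The decoding-dependency cost mentioned above can be made concrete with a small sketch. It assumes a dyadic hierarchical B-frame structure in the temporal direction only, with a group of pictures (GOP) whose size is a power of two and whose two boundary frames are treated as key frames without references; the exact structure of Figure 3.6 may differ, and inter-view references would add further dependencies. The function names are hypothetical.

```python
def hierarchical_b_structure(gop_size):
    """Return {frame_index: (level, references)} for one GOP.

    Assumes gop_size is a power of two. Frames 0 and gop_size are key
    frames (level 0); each B frame references the two nearest frames
    of a lower level, one before and one after it.
    """
    if gop_size < 1 or gop_size & (gop_size - 1):
        raise ValueError("gop_size must be a power of two")
    structure = {0: (0, ()), gop_size: (0, ())}
    level, step = 1, gop_size
    while step > 1:
        half = step // 2
        for mid in range(half, gop_size, step):
            structure[mid] = (level, (mid - half, mid + half))
        level, step = level + 1, half
    return structure

def decode_dependencies(structure, frame):
    """Return every frame that must be decoded before `frame`."""
    needed, stack = set(), list(structure[frame][1])
    while stack:
        ref = stack.pop()
        if ref not in needed:
            needed.add(ref)
            stack.extend(structure[ref][1])
    return needed

structure = hierarchical_b_structure(8)
for index in sorted(structure):
    level, refs = structure[index]
    print(f"frame {index}: level {level}, references {refs}")

# A frame on the highest level needs more previously decoded frames
# than a frame on a lower level.
print(sorted(decode_dependencies(structure, 3)))
print(sorted(decode_dependencies(structure, 4)))
```

Under these assumptions, frame 3 (highest level) depends on four other frames, whereas frame 4 (level 1) depends only on the two key frames, which shows why the highest-level frames are the most expensive to reach when decoding.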

One drawback of such an encoding structure is that there is no distinction

between the motion in time and the disparity among the frames of different

views. The motion fields and disparity fields have different characteristics. In

general, motion in time is bounded within a certain search field and changes
