4.1 Introduction
Improvements in 3D video technologies aim to provide more immersive
experiences to end users [1]. However, currently available technologies are still
limited in several ways. In stereoscopic 3D, the use of glasses causes
discomfort, which is combined with unnatural perception stimuli (e.g., when the eyes
converge on an object in front of or behind the screen but accommodate on the
screen) that are annoying for the viewer and can even cause eyestrain and
headaches. The small number of views displayed by existing glasses-free
auto-stereoscopic systems induces artifacts. The viewing zone is also restricted to a
limited number of sweet spots, and the system cannot provide smooth motion parallax
(i.e., the visualization is not continuous when moving in front of the display).
A study on Super Multi-View video (SMV) and integral imaging was
initiated at the October 2013 MPEG FTV meeting [2]. In these technologies,
tens or hundreds of views are used to obtain a so-called light-field representation of
a scene. A light-field represents the light rays in a 3D scene and is thus a function of
two angles (ray direction) and three spatial coordinates. This five-dimensional
function is called the plenoptic function [3, 4]. Some artifacts of current 3D
technologies, such as the vergence-accommodation conflict, can be eliminated using
the light-field representation. As a consequence, several companies are
already working on light-field display systems [5]. Light-field displays are
glasses-free systems that allow a more realistic visualization. They can provide
smooth motion parallax, which is a key element for the perception of depth, in
horizontal and potentially vertical directions. Immersive telepresence is a typical
target use case, as is the live 3D broadcast of sports events, such as the 2020 Olympics
in Japan, which could be shot by SMV camera arrays and projected on large SMV
display systems at several public viewing facilities in major cities around the world.
A large amount of data is needed to create a light-field representation; therefore, new
efficient coding technologies are required to handle the increasing number of input
views and exploit the characteristics of their representations [6].
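As a hedged illustration of the plenoptic function mentioned above, its five-dimensional form can be written as follows. The symbols P, (x, y, z), \theta, and \phi are notation assumed here for illustration only; they do not appear in the original text, and the cited references [3, 4] may use different conventions.

```latex
% Light carried by the ray passing through the spatial position (x, y, z)
% in the direction given by the two angles (\theta, \phi).
% All symbol names are assumptions made for this sketch.
P = P(x, y, z, \theta, \phi)
```

Under a pinhole-camera assumption, each captured view samples P at a single position over a range of directions, which gives an intuition for why tens or hundreds of views are used to approximate the light-field and why the amount of data grows with the number of views.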
This chapter is organized as follows. In Sect. 4.2, we first describe the capture
and display systems associated with the two main technologies that can provide full
parallax 3D video content: SMV and integral imaging. Second, we give an
overview of view extraction methods that convert an integral image into a
full parallax multi-view representation. In Sect. 4.3, state-of-the-art methods to
encode integral imaging content are presented, followed by the standard encoder
enhancements needed for full parallax SMV content. Finally, we show a new and
efficient inter-view prediction scheme that exploits the horizontal and vertical
dimensions at the coding structure level, and improvements to inter-view coding tools
that exploit the two-dimensional structure at the coding unit level as well. Conclusions
are drawn in section “Conclusion.”