Image Processing Reference
3.3.4 Non-scalable 3DHV Coding
To transmit 3DHV content over limited-bandwidth networks with adequate
quality, efficient video coding tools are needed that fully exploit the
inherent spatial and temporal correlations in this type of content.
The planar intensity distribution projected behind the micro-lens array, which
represents 3DHV frames, consists of a simple 2D array of micro-images of m × n
pixels. This is due to the structure of the micro-lens array that is used for capturing
this type of content. As such, this 2D array could be simply encoded by any 2D
image or video encoder. However, each micro-lens can be viewed as an individual
small low-resolution camera, recording a different perspective of the video scene, at
slightly different angles. Due to the small angular disparity between adjacent
micro-lenses, a significant cross-correlation exists between neighboring micro-images
(see Fig. 3.14). Therefore, this inherent cross-correlation of 3D holoscopic
images can be exploited for improving coding efficiency. Additionally, a significant
correlation also exists between neighboring pixels within each micro-image.
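The structure described above can be illustrated with a minimal sketch that splits a frame into its grid of micro-images and measures how similar two neighbors are. The function names, the list-of-rows frame layout, and the reading of m × n as m rows by n columns are assumptions made for illustration; they do not come from the cited schemes.

```python
import math

def split_micro_images(frame, m, n):
    """Split a 2D frame (list of pixel rows) into a grid of micro-images.

    Assumption: each micro-image is m rows by n columns and the frame
    size is an exact multiple of the micro-image size.
    """
    rows, cols = len(frame), len(frame[0])
    if rows % m or cols % n:
        raise ValueError("frame size must be a multiple of the micro-image size")
    grid = []
    for r0 in range(0, rows, m):
        grid_row = []
        for c0 in range(0, cols, n):
            # Cut the m x n block whose top-left corner is (r0, c0).
            grid_row.append([row[c0:c0 + n] for row in frame[r0:r0 + m]])
        grid.append(grid_row)
    return grid

def correlation(a, b):
    """Pearson correlation coefficient between two equally sized micro-images."""
    x = [p for row in a for p in row]
    y = [p for row in b for p in row]
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((p - mean_x) * (q - mean_y) for p, q in zip(x, y))
    var_x = sum((p - mean_x) ** 2 for p in x)
    var_y = sum((q - mean_y) ** 2 for q in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # flat micro-image: correlation is undefined, report 0
    return cov / math.sqrt(var_x * var_y)
```

For a real 3DHV frame, a correlation close to 1 between horizontally or vertically adjacent micro-images would reflect the small angular disparity between adjacent micro-lenses.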
3.3.4.1 3D-DCT Encoding
Early schemes for 3D holoscopic image and video encoding proposed in the
literature were based on the three-dimensional discrete cosine transform
(3D-DCT) [55-58]. These schemes take advantage of the existing redundancy
within the micro-images (i.e., images formed behind each micro-lens), as well as
the redundancy between adjacent micro-images, by applying the 3D-DCT to a stack
of several micro-images.
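As a rough illustration of the idea, the sketch below applies a separable, orthonormal DCT-II along the three axes of a stack of micro-images. It is not the method of any of the cited schemes: the function names and the stack[d][r][c] layout are assumptions, and quantization and entropy coding are omitted.

```python
import math

def dct_1d(x):
    """Orthonormal 1D DCT-II of a sequence of numbers."""
    n = len(x)
    out = []
    for k in range(n):
        s = sum(x[i] * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                for i in range(n))
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * s)
    return out

def dct_3d(stack):
    """Separable 3D DCT of a stack of micro-images indexed as stack[d][r][c]."""
    depth, rows, cols = len(stack), len(stack[0]), len(stack[0][0])
    # Pass 1: transform every pixel row (redundancy within a micro-image).
    out = [[dct_1d(row) for row in image] for image in stack]
    # Pass 2: transform every pixel column of every micro-image.
    for d in range(depth):
        for c in range(cols):
            column = dct_1d([out[d][r][c] for r in range(rows)])
            for r in range(rows):
                out[d][r][c] = column[r]
    # Pass 3: transform across the stack (redundancy between micro-images).
    for r in range(rows):
        for c in range(cols):
            vector = dct_1d([out[d][r][c] for d in range(depth)])
            for d in range(depth):
                out[d][r][c] = vector[d]
    return out
```

When the stacked micro-images are nearly identical, the third pass concentrates almost all of the energy in the first coefficient plane (d = 0), which is what makes the remaining planes cheap to encode.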
Other schemes additionally rely on the discrete wavelet transform (DWT)
[59, 60]. For example, in [59], 3D holoscopic images are decomposed into
Fig. 3.14 3D holoscopic video frame and image enlargement showing the repetitive structure due
to the micro-lens array