12.3.2 Image and Video Set Compression
Similar images with common pixel distributions, features, and backgrounds often
exist within the same set, such as a photo album, medical imaging, or satellite
imaging. How to efficiently exploit the correlation among the images or videos in
such a set to improve coding efficiency has long been recognized as an open issue
(Musatenko and Kurashov 1998). In image set compression, the main issue is how to
build the prediction structure so that the images and videos in the set can be
inter-predicted from one another to reduce redundancy. In the literature, the
existing approaches can be classified into three main categories.
The first approach is to generate or select a representative image for prediction,
so that every image in the set can be predicted from it. Various methods have been
applied to generate the prediction image, such as the Karhunen-Loeve transform
(KLT) (Musatenko and Kurashov 1998), the centroid-based method (Karadimitriou and
Tyler 1998), and the low-frequency template-based method (Yeung et al. 2011). The
advantage of this approach is its low delay and computational complexity, as only
the representative image needs to be decoded before the target image is decoded.
However, it may lose efficiency because a single representative image may not be
able to cover the diversity of the image content.
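
As a rough illustration of the representative-image idea, the sketch below uses a pixel-wise mean as a simple stand-in for the centroid-based method. It is not the algorithm from the cited papers: the `Image` type, the function names, and the tiny 2x2 images are all hypothetical, and a real codec would entropy-code the residuals rather than just measure their energy.

```python
from typing import List

# Hypothetical toy type: a grayscale image as rows of integer pixel values.
Image = List[List[int]]


def centroid_image(images: List[Image]) -> Image:
    """Pixel-wise rounded mean of equally sized images (assumed representative)."""
    n = len(images)
    height, width = len(images[0]), len(images[0][0])
    return [
        [round(sum(img[y][x] for img in images) / n) for x in range(width)]
        for y in range(height)
    ]


def residual(image: Image, reference: Image) -> Image:
    """Prediction residual: the part left to code after subtracting the reference."""
    return [[p - r for p, r in zip(row, ref_row)]
            for row, ref_row in zip(image, reference)]


def reconstruct(res: Image, reference: Image) -> Image:
    """Decoder side: only the representative image is needed to undo the prediction."""
    return [[d + r for d, r in zip(row, ref_row)]
            for row, ref_row in zip(res, reference)]


def energy(image: Image) -> int:
    """Sum of squared values, a crude proxy for how costly the data is to code."""
    return sum(p * p for row in image for p in row)


if __name__ == "__main__":
    album = [
        [[100, 102], [98, 100]],
        [[101, 103], [99, 101]],
        [[99, 101], [97, 99]],
    ]
    rep = centroid_image(album)
    for img in album:
        res = residual(img, rep)
        assert reconstruct(res, rep) == img
        print(energy(img), "->", energy(res))
```

Each target image depends only on the representative image, which is why decoding delay stays low. When the set is more diverse than this toy album, the residuals against a single reference grow quickly.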
The second approach treats the set of similar images as a video and applies a video
coding algorithm to compress the whole set. As in video compression, the images
in the image set are treated as a video sequence, and thus intercorrelation can
be exploited. For example, Chen et al. (2004) proposed organizing the coding order
by minimizing the prediction cost. Au et al. (2012) applied pixel-wise global
motion estimation before block-based motion compensation. Zou et al. (2013)
considered employing a minimum spanning tree (MST) to organize the coding
structure, with High Efficiency Video Coding (HEVC) applied to each branch.
However, unlike natural video, the correlation in the pseudovideo does not come
only from motion, as the images may be taken at different locations, viewpoints,
angles, and even focal lengths. Therefore, traditional block-level or global
motion estimation may not be able to reduce the intercorrelations in the image
set effectively.
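
The MST-based organization of the coding structure can be sketched as follows. This is a minimal illustration, not the method of Zou et al.: the sum of absolute differences is an assumed stand-in for the true prediction cost, Prim's algorithm is one of several ways to build the tree, and all names and the toy flattened images are hypothetical.

```python
from typing import List, Tuple

# Hypothetical toy type: a flattened grayscale image.
Image = List[int]


def prediction_cost(a: Image, b: Image) -> int:
    """Sum of absolute differences, an assumed proxy for the inter-prediction cost."""
    return sum(abs(p - q) for p, q in zip(a, b))


def mst_coding_order(images: List[Image], root: int = 0) -> List[Tuple[int, int]]:
    """Prim's algorithm over the complete graph of images.

    Returns (reference, target) pairs in the order they join the tree, so
    every reference is already coded when its target is predicted from it.
    The root image would be intra-coded.
    """
    in_tree = {root}
    edges: List[Tuple[int, int]] = []
    while len(in_tree) < len(images):
        best = None  # (cost, reference, target)
        for u in sorted(in_tree):
            for v in range(len(images)):
                if v in in_tree:
                    continue
                cost = prediction_cost(images[u], images[v])
                if best is None or cost < best[0]:
                    best = (cost, u, v)
        _, u, v = best
        in_tree.add(v)
        edges.append((u, v))
    return edges


if __name__ == "__main__":
    image_set = [[10, 10], [50, 50], [12, 11], [52, 49]]
    for ref, target in mst_coding_order(image_set):
        print(f"predict image {target} from image {ref}")
```

For this toy set the tree pairs the two dark images and the two bright images and crosses between the groups only once, whereas coding the images in their stored order would cross between the groups at every step.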
The third kind of approach employs local features to exploit the correlations
among images. With a feature matching algorithm, similar corresponding regions
can be located regardless of differences in scale, location, etc. In Shi et al.
(2014), the SIFT feature is used to exploit the correlations within the image set.
To compress the pseudosequence, a feature-based prediction technique is employed
to further reduce the intercorrelations among the images. Instead of pixel-level
comparison, the SIFT descriptors are first matched to exploit the correlation, and
then the geometric disparities between two images are reduced by a feature-based
geometric deformation process.
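
The descriptor-matching step can be illustrated with a small nearest-neighbour matcher. This sketch makes several assumptions that are not taken from Shi et al.: it uses Euclidean distance with a ratio test to reject ambiguous matches, the descriptors are 4-dimensional toy vectors rather than real 128-dimensional SIFT descriptors, and the function names are hypothetical. The subsequent geometric deformation step is not shown.

```python
import math
from typing import List, Sequence, Tuple

# Hypothetical toy type: one local feature descriptor.
Descriptor = Sequence[float]


def distance(a: Descriptor, b: Descriptor) -> float:
    """Euclidean distance between two descriptors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def match_descriptors(
    query: List[Descriptor],
    reference: List[Descriptor],
    ratio: float = 0.8,  # assumed threshold for the ratio test
) -> List[Tuple[int, int]]:
    """Return (query_index, reference_index) pairs for unambiguous matches.

    A match is kept only if the nearest reference descriptor is clearly
    closer than the second nearest one.
    """
    matches: List[Tuple[int, int]] = []
    for qi, q in enumerate(query):
        ranked = sorted((distance(q, r), ri) for ri, r in enumerate(reference))
        if not ranked:
            continue
        best_dist, best_idx = ranked[0]
        if len(ranked) == 1 or best_dist < ratio * ranked[1][0]:
            matches.append((qi, best_idx))
    return matches


if __name__ == "__main__":
    reference = [[0, 0, 1, 1], [5, 5, 0, 0], [9, 1, 9, 1]]
    query = [
        [5.1, 4.9, 0, 0.1],    # close to reference[1]
        [0.1, 0, 1, 0.9],      # close to reference[0]
        [2.5, 2.5, 0.5, 0.5],  # equally far from reference[0] and reference[1]
    ]
    print(match_descriptors(query, reference))
```

Because matching operates on descriptors rather than pixel positions, the matched regions can lie anywhere in the two images. The matched keypoint locations would then feed the geometric deformation step.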
In addition to image set compression, video set compression has also been studied
in the literature. Unlike in an image set, each video in the set may originate