Hot Research Topics in Video Coding and Systems - Advanced Video Coding Systems

12.3.2 Image and Video Set Compression

Similar images that share pixel distributions, features, and backgrounds often
exist within the same set, such as a photo album, a medical imaging series, or a
satellite imaging collection. How to efficiently exploit the correlation among the
images or videos in such a set to improve coding efficiency has long been
recognized as an open issue (Musatenko and Kurashov 1998). In image set
compression, the main issue is how to build the prediction structure so that the
images and videos in the set can be inter-predicted to reduce redundancy. In the
literature, the existing approaches can be mainly classified into three categories.

The first approach is to generate or select a representative image for prediction,
so that every image in the set can be predicted from it. Various methods have been
applied to generate the prediction image, such as the Karhunen-Loeve transform
(KLT) (Musatenko and Kurashov 1998), the centroid-based method (Karadimitriou and
Tyler 1998), and the low-frequency template-based method (Yeung et al. 2011). The
advantage of this approach is its low delay and low computational complexity, as
only the representative image needs to be decoded before the target image is
decoded. However, it may lose efficiency when the representative image cannot
cover the diversity of the image content.
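
As a rough illustration of the representative-image idea, the sketch below builds a pixel-wise mean image and codes every image as a residual against it. It is a simplified stand-in for the centroid-style methods cited above, not a reproduction of any of them. The function names (centroid_image, residual, reconstruct) and the use of nested lists for grayscale images are assumptions made for this example, and a real codec would transform and entropy-code the residuals.

```python
def centroid_image(images):
    """Pixel-wise mean of equally sized grayscale images (lists of rows)."""
    n = len(images)
    height, width = len(images[0]), len(images[0][0])
    return [
        [round(sum(img[y][x] for img in images) / n) for x in range(width)]
        for y in range(height)
    ]


def residual(image, reference):
    """Difference signal that would be coded instead of the image itself."""
    return [
        [p - r for p, r in zip(row, ref_row)]
        for row, ref_row in zip(image, reference)
    ]


def reconstruct(res, reference):
    """Decoder side: only the reference and the residual are needed."""
    return [
        [d + r for d, r in zip(row, ref_row)]
        for row, ref_row in zip(res, reference)
    ]


# Three tiny 2x2 "images" that differ only by a brightness offset.
album = [
    [[10, 20], [30, 40]],
    [[12, 22], [32, 42]],
    [[14, 24], [34, 44]],
]
reference = centroid_image(album)
residuals = [residual(img, reference) for img in album]
```

Decoding any single image requires only the reference and that image's residual, which is where the low delay of this approach comes from. If the set were more diverse, the residuals would grow and the benefit would shrink.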

The second approach treats the set of similar images as a video sequence (a
pseudovideo) and applies a video coding algorithm to compress the whole set, so
that the correlation between images can be exploited by inter prediction. For
example, Chen et al. (2004) proposed to organize the coding order by minimizing
the prediction cost. Au et al. (2012) applied pixel-wise global motion estimation
before block-based motion compensation. Zou et al. (2013) considered employing a
minimum spanning tree (MST) to organize the coding structure, with High
Efficiency Video Coding (HEVC) applied to each branch. However, unlike in natural
video, the correlation in the pseudovideo does not come only from motion, as the
images may be taken at different locations, viewpoints, angles, and even focal
lengths. Therefore, traditional block-level or global motion estimation may not be
able to effectively reduce the inter-image redundancy in the set.
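
The tree-structured coding order can be sketched with Prim's algorithm over pairwise prediction costs. This is a minimal illustration under stated assumptions, not the method of Zou et al.: the cost here is a plain sum of absolute differences (SAD) between flattened images, the cost is treated as symmetric, and the names sad and mst_prediction_structure are hypothetical. A practical system would estimate the cost from an actual inter-prediction pass.

```python
def sad(a, b):
    """Sum of absolute differences between two flattened images."""
    return sum(abs(x - y) for x, y in zip(a, b))


def mst_prediction_structure(images, root=0):
    """Prim's algorithm: returns (coding order, reference image of each image)."""
    parent = {root: None}  # the root is intra-coded, so it has no reference
    order = [root]
    # best[i] = (cheapest known cost, reference image) for each uncoded image i
    best = {
        i: (sad(images[root], images[i]), root)
        for i in range(len(images))
        if i != root
    }
    while best:
        nxt = min(best, key=lambda i: best[i][0])
        _, ref = best.pop(nxt)
        parent[nxt] = ref
        order.append(nxt)
        # The newly coded image may be a cheaper reference for the rest.
        for i in best:
            cost = sad(images[nxt], images[i])
            if cost < best[i][0]:
                best[i] = (cost, nxt)
    return order, parent


# Four flattened 2x2 images; images 0 and 2 are alike, as are images 1 and 3.
images = [[0, 0, 0, 0], [10, 10, 10, 10], [1, 1, 1, 1], [11, 11, 11, 11]]
order, parent = mst_prediction_structure(images)
```

Each image is predicted from its parent in the tree, so similar images end up adjacent in the prediction structure regardless of their original position in the set.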

The third approach employs local features to exploit the correlations among
images. With a feature matching algorithm, similar corresponding regions can be
located regardless of differences in scale, location, etc. In Shi et al. (2014),
the SIFT feature is used to exploit the correlations among the images in the set.
To compress the pseudosequence, a feature-based prediction technique is employed
to further reduce the redundancy among the images. Instead of pixel-level
comparison, the SIFT descriptors are first matched to exploit the correlation, and
then the geometric disparities between two images are reduced by a feature-based
geometric deformation process.
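
To make the geometric deformation step more concrete, the sketch below fits a transform to already matched keypoint coordinates. It assumes a similarity model (uniform scale, rotation, and translation), which is a simplification chosen for this example and not necessarily the deformation model used by Shi et al. The matched point pairs are assumed to come from a descriptor matcher such as SIFT matching, and the names fit_similarity and warp_point are hypothetical. Points are treated as complex numbers so that the least-squares fit of dst = a * src + b has a closed form.

```python
def fit_similarity(src, dst):
    """Least-squares fit of dst ~ a * src + b, with complex a and b.

    abs(a) is the scale, the phase of a is the rotation, b is the translation.
    """
    p = [complex(x, y) for x, y in src]
    q = [complex(x, y) for x, y in dst]
    n = len(p)
    p_mean = sum(p) / n
    q_mean = sum(q) / n
    num = sum((pi - p_mean).conjugate() * (qi - q_mean) for pi, qi in zip(p, q))
    den = sum(abs(pi - p_mean) ** 2 for pi in p)
    a = num / den
    b = q_mean - a * p_mean
    return a, b


def warp_point(a, b, point):
    """Map a reference-image coordinate into the target image."""
    z = a * complex(*point) + b
    return (z.real, z.imag)


# Matched keypoints: the target is the reference scaled by 2, rotated by
# 90 degrees, and shifted by (3, 4).
src = [(0, 0), (1, 0), (0, 1), (2, 3)]
dst = [(3, 4), (3, 6), (1, 4), (-3, 8)]
a, b = fit_similarity(src, dst)
```

Once a and b are known, the reference image can be warped toward the target before prediction, so that the remaining block-level motion search only has to handle small local differences.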

In addition to image set compression, video set compression has also been studied
in the literature. Unlike in an image set, each video in the set may originate
