concerned. Two recent standards designed for HD video and film content have this
property: H.264 and Motion JPEG2000, known as (M)JPEG2000 [1]. While H.264
has been designed for HD TV and follows the principles of previous standards,
in the sense that its transform (the Integer Block Transform) is a variation of
the DCT and does not have the scalability property, the (M)JPEG2000 standard is
naturally scalable, owing to the scalable nature of the transform it uses: the
Discrete Wavelet Transform (DWT).
(M)JPEG2000 is the part of the JPEG2000 standard devoted to motion sequences of
images. Nevertheless, contrary to H.264, it does not encode motion information:
each frame is encoded independently, in intra-frame mode, by JPEG2000. In the
following, we give insight into the JPEG2000 standard [5].
2.1.1
(M)JPEG2000 Standard
Initiated in March 1997 and becoming an international ISO standard in December
2000, JPEG2000 exhibited a new level of efficiency, specifically for
high-resolution (HD) images. The specifications of the DCI (Digital Cinema Ini-
tiatives, LLC [6]) made (M)JPEG2000 the digital cinema compression standard.
(M)JPEG2000 is the extension of the JPEG2000 standard to video: each frame
in the video sequence is considered separately and encoded with JPEG2000. Fur-
thermore, (M)JPEG2000 is becoming the common standard for archiving cultural
cinematographic and video heritage [7], offering a better quality/compression
trade-off than previously used solutions. The JPEG2000 standard follows the ideas
initially proposed in MPEG-4 [8] for object-based coding, namely the possibility
of encoding Regions of Interest (ROI) more precisely in each frame or in a single
image. The industrial reality of this advanced feature of JPEG2000 turned
out to be much the same as with MPEG-4: despite a rich body of research work
proposing various methods for ROI extraction (e.g. [9, 10]), JPEG2000 as commonly
used is limited to encoding the whole frame. More precisely, an image frame
is modeled as a set of tiles on which the coding process operates independently, as
depicted in Figure 1, a frame being typically considered as a single tile.
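The tiling step above can be sketched as follows; this is a minimal illustration of partitioning a frame into independently coded tiles, with illustrative (not standard-mandated) tile dimensions:

```python
# Sketch: splitting a frame into independent tiles, as in JPEG2000 tiling.
# Tile and frame sizes here are illustrative, not taken from the standard.

def split_into_tiles(frame, tile_h, tile_w):
    """Partition a 2-D frame (list of rows) into a list of tiles.

    Border tiles may be smaller when the frame dimensions are not
    multiples of the tile size, as JPEG2000 allows.
    """
    h, w = len(frame), len(frame[0])
    tiles = []
    for top in range(0, h, tile_h):
        for left in range(0, w, tile_w):
            tile = [row[left:left + tile_w]
                    for row in frame[top:top + tile_h]]
            tiles.append(tile)
    return tiles

# A 4x6 "frame" of sample values split into 2x3 tiles yields four tiles.
frame = [[r * 6 + c for c in range(6)] for r in range(4)]
tiles = split_into_tiles(frame, 2, 3)
```

Each tile would then go through the transform, quantization, and coding stages on its own; in digital cinema practice the whole frame is usually a single tile.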
The core of the standard is the DWT, which, in the case of lossy compression, is
realized by high-pass and low-pass filters designed for zero-mean signals. This is
why a level offset is necessary at the pre-processing step. Furthermore, the standard
operates in the YCbCr color space; hence, if the source is in RGB, a linear color
transform has to be applied. The resulting frame then undergoes the DWT, which we
describe below. The transform coefficients are quantized to reduce the quantity of
information, and entropy coding known as EBCOT (Embedded Block Coding with Opti-
mized Truncation) is performed on these quantized values. At the first step (Tier 1),
context modeling and arithmetic coding are realized; at the second step (Tier 2),
the bit allocation for the output bit stream is performed.
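The pre-processing and quantization steps just described can be sketched as follows. This is a simplified per-sample illustration: the color-transform coefficients are those of the irreversible color transform of lossy JPEG2000, while the quantization step size `delta` is an arbitrary assumed value (the standard signals a step size per subband):

```python
import math

def level_offset(sample, bit_depth=8):
    """Shift an unsigned sample to a signed, roughly zero-mean range,
    as required by the wavelet filters designed for zero-mean signals."""
    return sample - (1 << (bit_depth - 1))

def ict(r, g, b):
    """Irreversible color transform (RGB -> YCbCr) of lossy JPEG2000."""
    y  =  0.299   * r + 0.587   * g + 0.114   * b
    cb = -0.16875 * r - 0.33126 * g + 0.5     * b
    cr =  0.5     * r - 0.41869 * g - 0.08131 * b
    return y, cb, cr

def quantize(coeff, delta):
    """Uniform dead-zone scalar quantization of a transform coefficient:
    q = sign(c) * floor(|c| / delta)."""
    return int(math.copysign(math.floor(abs(coeff) / delta), coeff))

# Example: an 8-bit sample 200 is shifted to 72; with an assumed
# step size of 4 its quantization index is 18.
index = quantize(level_offset(200), 4)
```

The quantization indices, organized into code-blocks, are what EBCOT's Tier 1 then encodes.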
The decoder proceeds in the inverse order to reconstruct the frame. In the
lossy scheme the original pixel values cannot be recovered, but the quantization
matrix is designed to take into account the psycho-visual properties of the Human
Visual System.
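The decoder-side dequantization can be sketched as follows; it makes concrete why the original values cannot be recovered: every coefficient falling in a quantization bin is reconstructed to the same point of that bin (here the mid-point, a common choice; the step size is again an assumed value):

```python
def dequantize(q, delta, r=0.5):
    """Reconstruct a coefficient from its quantization index q.

    r is the reconstruction offset within the bin (0.5 = mid-point);
    all coefficients that shared the bin come back as this one value.
    """
    if q == 0:
        return 0.0
    sign = 1.0 if q > 0 else -1.0
    return sign * (abs(q) + r) * delta

# A coefficient of 72.0 quantized with step 4 has index 18; it is
# reconstructed as 74.0, not 72.0 -- the irreversible quantization error.
recovered = dequantize(18, 4)
```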