desirable property in the developed technology. It stands for the adaptability of the
data encoding and delivery process to different temporal and spatial resolutions that
may be imposed by specific network properties. It has become a focus of interest in
HD video coding, which has led to coding standards such as SVC, H.264
and (M)JPEG2000 [1], and it is also of interest in post-processing technologies.
Hierarchical transforms (e.g. wavelet transforms) are not only efficient tools to describe
and compress video content; they also naturally yield a scalable description of
this content. They are thus natural candidates to help define scalable indexing
methods.
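To make concrete why a hierarchical transform yields a scalable description, the following minimal sketch decomposes a small grayscale image over several levels and then decodes it at a chosen resolution. It assumes an unnormalized Haar-style averaging transform, picked only because it is the shortest to write; it is not the filter bank of any of the standards cited above, and the function names (haar_step, decompose, inverse_step, reconstruct) are hypothetical.

```python
# Illustrative sketch only: an unnormalized Haar-style averaging transform.
# This is an assumption made for brevity, not the filters used by the standards.

def haar_step(image):
    """Split an even-sized grayscale image (list of rows) into four half-size bands."""
    approx, col_diff, row_diff, diag_diff = [], [], [], []
    for r in range(0, len(image), 2):
        a_row, c_row, r_row, d_row = [], [], [], []
        for c in range(0, len(image[0]), 2):
            p, q = image[r][c], image[r][c + 1]
            s, t = image[r + 1][c], image[r + 1][c + 1]
            a_row.append((p + q + s + t) / 4)  # local mean: the low-resolution image
            c_row.append((p - q + s - t) / 4)  # difference between the two columns
            r_row.append((p + q - s - t) / 4)  # difference between the two rows
            d_row.append((p - q - s + t) / 4)  # diagonal difference
        approx.append(a_row)
        col_diff.append(c_row)
        row_diff.append(r_row)
        diag_diff.append(d_row)
    return approx, (col_diff, row_diff, diag_diff)


def decompose(image, levels):
    """Return the coarsest approximation and the detail bands, finest level first."""
    details = []
    approx = image
    for _ in range(levels):
        approx, bands = haar_step(approx)
        details.append(bands)
    return approx, details


def inverse_step(approx, bands):
    """Undo one haar_step: rebuild the image at twice the resolution."""
    col_diff, row_diff, diag_diff = bands
    image = []
    for r in range(len(approx)):
        top, bottom = [], []
        for c in range(len(approx[0])):
            a, x = approx[r][c], col_diff[r][c]
            y, d = row_diff[r][c], diag_diff[r][c]
            top.extend([a + x + y + d, a - x + y - d])
            bottom.extend([a + x - y - d, a - x - y + d])
        image.extend([top, bottom])
    return image


def reconstruct(approx, details, levels_to_undo):
    """Decode at a chosen resolution by undoing only the coarsest levels."""
    image = approx
    start = len(details) - levels_to_undo
    for bands in reversed(details[start:]):
        image = inverse_step(image, bands)
    return image


if __name__ == "__main__":
    frame = [[r * 4 + c for c in range(4)] for r in range(4)]
    coarse, detail_bands = decompose(frame, 2)
    print(reconstruct(coarse, detail_bands, 1))  # half-resolution version
    print(reconstruct(coarse, detail_bands, 2))  # full-resolution version
```

The same coefficients serve every resolution: a decoder (or an indexing method) that stops after the coarsest levels obtains a valid lower-resolution image without touching the finer detail bands.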
In this chapter, we present the first research work on scalable HD video
indexing methods in the transform domain. The first method extracts information
directly from the compressed video stream while the second deals with raw data.
The standards in video coding (e.g. SVC) indeed use a hierarchical transform
to compress the data. The design of descriptors directly extracted from this domain
thus ensures the scalability of the proposed method while allowing for a coherent
and fast processing of the data. In this framework, we propose two video indexing
and retrieval methods.
The first part of the chapter focuses on indexing in the compressed domain. We
give an overview of the transforms used and then present methods that explore
the transform coefficients to extract meaningful features, such as objects [2] or
visual dictionaries [3], from the video stream at different levels of decomposition.
The resulting descriptors are used for video partitioning and retrieval.
We first introduce a system based on Daubechies wavelets and designed for
joint indexing and encoding of HD content by JPEG2000-like encoders. We
also present an overview of emerging methods such as [3] and develop our previous
contributions [2, 4].
In the second part of this chapter, we describe a video indexing method that
builds on a hierarchical description of the decoded data. Spatial and temporal
descriptors of the video content are defined, which rely respectively on the
coherence of a wavelet description of the key-frames and on the motion of blocks
in the video. The method builds on the redundancy of the descriptors (induced, in
part, by the transform used, the Laplacian pyramid) to statistically compare two
videos. The invariance properties of the descriptors as well as the statistical point of
view allow for some robustness to geometric and radiometric alterations.
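The redundancy attributed above to the Laplacian pyramid can be illustrated with a minimal sketch. It works on a 1-D signal (for instance one row of a key-frame) and assumes a simple pair-averaging low-pass filter with sample repetition for upsampling, rather than the Gaussian-like filters usually associated with this transform; the function names are hypothetical and the code is not the method described in this chapter.

```python
# Illustrative sketch only: a 1-D Laplacian pyramid with a box filter.
# The filter choice and all names are assumptions made for brevity.

def downsample(signal):
    """Average adjacent pairs, halving the length (even length assumed)."""
    return [(signal[i] + signal[i + 1]) / 2 for i in range(0, len(signal), 2)]


def upsample(signal):
    """Repeat each sample twice, doubling the length."""
    out = []
    for value in signal:
        out.extend([value, value])
    return out


def build_laplacian_pyramid(signal, levels):
    """Return the detail layers (finest first) followed by the coarse residual."""
    pyramid = []
    current = list(signal)
    for _ in range(levels):
        coarse = downsample(current)
        predicted = upsample(coarse)
        # Each layer stores what the coarser level fails to predict.
        pyramid.append([x - p for x, p in zip(current, predicted)])
        current = coarse
    pyramid.append(current)
    return pyramid


def reconstruct(pyramid):
    """Rebuild the signal by adding each detail layer to the upsampled coarser level."""
    current = pyramid[-1]
    for detail in reversed(pyramid[:-1]):
        current = [p + d for p, d in zip(upsample(current), detail)]
    return current


if __name__ == "__main__":
    row = [1, 3, 5, 7, 2, 4, 6, 8]
    layers = build_laplacian_pyramid(row, 2)
    # The pyramid holds more coefficients than the input: it is redundant.
    print(sum(len(layer) for layer in layers), "coefficients for", len(row), "samples")
    print(reconstruct(layers))
```

Unlike a critically sampled wavelet transform, the detail layer at each level keeps the full length of that level, so the representation is overcomplete. This is the kind of redundancy that a statistical comparison of descriptors can draw on.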
2 HD Content Indexing in the Compressed Domain
2.1 Scalable Compression Standards
The scalability of a representation of the video content is a property that
has been part of multimedia standards since MPEG-2. It means that various
temporal and spatial resolutions of a video stream, as well as different video
qualities, can be decoded from the same code-stream. The first case is temporal
scalability, the second is spatial scalability and finally, the SNR scalability is