Scalable Indexing of HD Video - High-Quality Visual Experience


To give an intuition of the kNN entropy estimator, note that it can be considered as the combination of the kNN PDF estimator $\hat{p}$ with the Ahmad-Lin entropy approximation $H_{AL}$ [40]:

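$$
\begin{cases}
\hat{p}(x) = \dfrac{1}{|W|\, v_d\, \rho_k^d(W, x)} \displaystyle\sum_{w \in W} \delta\big(|x - w| < \rho_k(W, x)\big) \\[2ex]
H_{AL}(W) = -\dfrac{1}{|W|} \displaystyle\sum_{w \in W} \log p(w)
\end{cases}
\qquad (23)
$$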

where $x$ is an element of $\mathbb{R}^d$, $W$ is a set of $d$-dimensional realizations whose underlying PDF is $p$, $|W|$ is the cardinality of $W$, $v_d$ is the volume of the unit ball in $\mathbb{R}^d$, $\rho_k(W, x)$ is the distance between $x$ and its $k$-th nearest neighbor among the elements of $W$, and $\delta(B)$ is equal to 1 if $B$ is true and zero otherwise. Replacing $p$ in $H_{AL}$ with $\hat{p}$ leads to a (biased) kNN-based entropy estimator close to the unbiased version proposed in [35, 36, 37].
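The plug-in construction of Eq. (23) can be sketched in a few lines of Python. This is only an illustrative, brute-force sketch using the standard library; the function names (`unit_ball_volume`, `knn_distance`, `knn_pdf`, `knn_entropy`) and the `skip` parameter are hypothetical and not taken from the chapter. It also assumes that, when the query point is itself a sample of W, its k-th nearest neighbor is searched among the other samples, so that the distance is never zero.

```python
import math

def unit_ball_volume(d):
    """Volume v_d of the unit ball in R^d."""
    return math.pi ** (d / 2) / math.gamma(d / 2 + 1)

def knn_distance(W, x, k, skip=None):
    """Distance rho_k(W, x) from x to its k-th nearest neighbor in W.

    `skip` is an optional index of W to ignore (assumption: used so that
    a sample of W is not counted as its own neighbor).
    """
    dists = sorted(math.dist(x, w) for i, w in enumerate(W) if i != skip)
    return dists[k - 1]

def knn_pdf(W, x, k, skip=None):
    """kNN PDF estimate: samples strictly inside the k-th neighbor ball,
    divided by |W| times the volume of that ball."""
    d = len(x)
    rho = knn_distance(W, x, k, skip)
    inside = sum(1 for w in W if math.dist(x, w) < rho)
    return inside / (len(W) * unit_ball_volume(d) * rho ** d)

def knn_entropy(W, k):
    """Ahmad-Lin approximation with p replaced by the kNN PDF estimate."""
    n = len(W)
    return -sum(math.log(knn_pdf(W, W[i], k, skip=i)) for i in range(n)) / n

# Toy usage: five equally spaced 1-D samples, k = 1.
samples = [(0.0,), (1.0,), (2.0,), (3.0,), (4.0,)]
entropy = knn_entropy(samples, k=1)
```

The brute-force neighbor search costs O(|W|^2) distance evaluations; a spatial index would be the natural replacement for larger sample sets. Duplicate samples would yield a zero distance and are not handled here.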

Subtracting the kNN entropy estimation from the kNN cross-entropy estimation leads to the following kNN Kullback-Leibler estimation:

$$
D_{KL}(U \,\|\, V) = \log\frac{|V|}{|U| - 1} + \frac{d}{|U|} \sum_{u \in U} \log \rho_k(V, u) - \frac{d}{|U|} \sum_{u \in U} \log \rho_k(U, u). \qquad (24)
$$
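As a rough illustration, Eq. (24) can be evaluated directly from k-th nearest neighbor distances. The sketch below is a brute-force version using only the Python standard library; the names `knn_distance`, `knn_kl_divergence`, and the `skip` parameter are hypothetical. It assumes that each sample of U is excluded from its own neighbor search within U (consistent with the |U| - 1 term), and that U and V share no identical points, since a zero distance would make the logarithm undefined.

```python
import math

def knn_distance(points, x, k, skip=None):
    """Distance from x to its k-th nearest neighbor among `points`.

    `skip` is an optional index to ignore (assumption: a sample is not
    treated as its own neighbor).
    """
    dists = sorted(math.dist(x, p) for i, p in enumerate(points) if i != skip)
    return dists[k - 1]

def knn_kl_divergence(U, V, k=1):
    """kNN estimate of D_KL(U || V) for two lists of d-dimensional tuples."""
    d = len(U[0])
    n_u, n_v = len(U), len(V)
    # Sum of log distances from each u to its k-th neighbor in V.
    cross_term = sum(math.log(knn_distance(V, u, k)) for u in U)
    # Same within U, leaving out the sample itself.
    self_term = sum(
        math.log(knn_distance(U, u, k, skip=i)) for i, u in enumerate(U)
    )
    return math.log(n_v / (n_u - 1)) + (d / n_u) * (cross_term - self_term)

# Toy usage with 1-D samples: V_near interleaves with U, V_far is shifted away.
U = [(0.0,), (1.0,), (2.0,)]
V_near = [(0.5,), (1.5,), (2.5,), (3.5,)]
V_far = [(10.5,), (11.5,), (12.5,), (13.5,)]
kl_near = knn_kl_divergence(U, V_near, k=1)
kl_far = knn_kl_divergence(U, V_far, k=1)
```

In this toy setting the estimate for the shifted set is larger than for the interleaved one, which is the behavior expected of a dissimilarity measure.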

3.3 Scalable Content-Based Queries with Patches Descriptors

In this section we assess the quality of the proposed GOP dissimilarity measure for the retrieval problem. The experiments were performed on video sequences from the ICOS-HD project database. After a brief description of the database, we analyze retrieval results based on spatial frame descriptors alone, temporal/motion descriptors alone, and both sets of descriptors combined.

3.3.1 ICOS-HD Video Database

The ICOS-HD project provides a large database of both original Full HD videos and edited versions. Each original sequence contains 72 Full HD frames (1920 × 1080 pixels) and has been manually split up into clips, such that the boundary between the clips roughly corresponds to a relevant motion transition. In addition, common geometric and radiometric deformations were applied to the original HD video sequences, thus obtaining different versions of each video clip.

For these experiments, we used ten video sequences (see some thumbnails in Figure 13). The deformations we considered are scaling and quality degradation by high JPEG2000 compression, for a total of four different versions of each video clip:

• original Full HD (1920 × 1080 pixels), referenced as 1920 in the figures;
• a rescaled version (960 × 540 pixels), referenced as 960;
• two JPEG2000 coded versions (low and very low quality), referenced as jpeg-q1 and jpeg-q10.
