Fig. 12 Building a motion patch at p = (x, y) (the size of the blocks corresponds to the lowest scale of the multiscale decomposition used for the spatial description)

More precisely, for a GOP of n consecutive frames f_1, ..., f_n, we compute the following motion patch for each block of center (x, y):
m(x, y) = ( x, y, u_{1,2}(x, y), u_{2,3}(x, y), ..., u_{n-1,n}(x, y) )    (20)
where u_{n-1,n}(x, y) is the apparent motion of the block centered at (x, y) from frame f_{n-1} to frame f_n (see Fig. 12). The motion vectors u are computed via a diamond-search block-matching algorithm. For each GOP studied, we compute the motion patches m(x, y) for each block (x, y). Note that we include in the motion patch its location (x, y), so that each patch has length 2n (which is 16 for GOPs of 8 frames).
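As a concrete illustration, assembling the patch of Eq. (20) can be sketched as follows (a minimal sketch; the inter-frame motion vectors are assumed to be precomputed, e.g. by the diamond-search block matching just mentioned, and the function name is ours, not the chapter's):

```python
import numpy as np

def motion_patch(x, y, motions):
    """Build the motion patch m(x, y) of Eq. (20) for the block centered
    at (x, y). `motions` is the list of n-1 apparent-motion vectors
    (u_x, u_y), one per consecutive frame pair f_k -> f_{k+1}."""
    patch = [x, y]                   # the patch starts with its location
    for ux, uy in motions:           # then each u_{k,k+1}(x, y) in order
        patch.extend([ux, uy])
    return np.asarray(patch, dtype=float)

# GOP of n = 8 frames -> 7 inter-frame vectors -> patch of length 2n = 16
m = motion_patch(10, 20, [(1, 0)] * 7)
```

The leading (x, y) pair is what brings the length to 2n rather than 2(n-1).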
As is the case for spatial patches, only a few motion patches effectively describe motion (sparsity). We therefore select the significant motion patches by thresholding, keeping only those with the largest motion amplitude (the sum of squares of the u components in Eq. (20)). The threshold value used in Section 3.3 is zero: the motion patches kept are those whose motion amplitude is non-zero.
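The amplitude-based selection can be sketched like this (a hypothetical helper following the patch layout of Eq. (20), where the first two components are the (x, y) location and the rest are the u components):

```python
import numpy as np

def select_significant(patches, threshold=0.0):
    """Keep only the motion patches whose motion amplitude (the sum of
    squares of the u components, i.e. everything after the leading
    (x, y) location) exceeds `threshold`. With the default threshold of
    zero, as in Section 3.3, exactly the static blocks are dropped."""
    patches = np.asarray(patches, dtype=float)
    amplitude = np.sum(patches[:, 2:] ** 2, axis=1)
    return patches[amplitude > threshold]

# two length-6 patches (GOP of 3 frames): one static, one moving
kept = select_significant([[0, 0, 0, 0, 0, 0],
                           [1, 1, 2, 0, 0, 0]])
```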
3.2 Using the Kullback-Leibler Divergence as a Similarity Measure

3.2.1 Motivation and Expression
As mentioned in Section 3, the comparison between two HD video segments is
performed by statistically measuring the dissimilarity between their respective sets
of (spatial and temporal) descriptors within the successive GOPs. Indeed, the scale
and location of the descriptors extracted in each segment will not match in general
even if the segments are visually similar. Therefore, a dissimilarity based on one-
to-one distance measures is not adequate. Instead, it is more appropriate to consider
each set of descriptors as a set of realizations of a multidimensional random vari-
able characterized by a particular probability density function (PDF), and to measure the dissimilarity between these PDFs. Because the descriptors are defined in high-dimensional spaces, direct PDF estimation is problematic. The k-th nearest neighbor (kNN) framework provides interesting estimators in this context [32, 33, 34]. First,
they are less sensitive to the curse of dimensionality. Second, they are expressed
directly in terms of the realizations. Besides a PDF estimator, a consistent, asymp-
totically unbiased entropy estimator has been proposed [35, 36, 37]. To compare two
PDFs in this framework, entropy-based measures then appear as a good option. We
chose the Kullback-Leibler divergence because it proved to be successful in similar
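For illustration, a kNN-based estimate of the Kullback-Leibler divergence between two sample sets can be sketched as follows (a brute-force sketch in the spirit of the estimators cited above; the chapter's exact estimator may differ):

```python
import numpy as np

def knn_kl_divergence(x, y, k=1):
    """kNN estimate of D(p || q) from samples x ~ p and y ~ q:
    D ~= (d / n) * sum_i log(nu_k(i) / rho_k(i)) + log(m / (n - 1)),
    where rho_k(i) is the distance from x_i to its k-th neighbor within
    x (excluding itself) and nu_k(i) its k-th-neighbor distance in y."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    n, d = x.shape
    m = y.shape[0]
    # brute-force pairwise Euclidean distances (fine for small sets)
    dxx = np.linalg.norm(x[:, None, :] - x[None, :, :], axis=-1)
    dxy = np.linalg.norm(x[:, None, :] - y[None, :, :], axis=-1)
    rho = np.sort(dxx, axis=1)[:, k]       # skip the zero self-distance
    nu = np.sort(dxy, axis=1)[:, k - 1]
    return d * np.mean(np.log(nu / rho)) + np.log(m / (n - 1))
```

In use, x and y would be the two segments' sets of descriptors; the estimator works directly on the realizations, with no explicit PDF fit.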