Scalable Indexing of HD Video - High-Quality Visual Experience


\[
\mathrm{MAD}_B(d_x, d_y) = \frac{1}{|B|} \sum_{(x,y) \in B} \left| \mathrm{LLY}_K(x, y, t) - \mathrm{LLY}_K(x + d_x, y + d_y, t - d_t) \right| \tag{9}
\]

Here LLY_K is the Y-component of the low-frequency subband and B is the considered block. Estimation with pixel accuracy turns out to be better than with half-pixel accuracy because of the shift-variance of wavelets. Then the global six-parameter affine motion model is estimated by robust weighted least squares:

\[
\begin{aligned}
d_x(x, y) &= a_1 + a_2 x + a_3 y \\
d_y(x, y) &= a_4 + a_5 x + a_6 y
\end{aligned}
\tag{10}
\]
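The estimation of Eq. (10) can be sketched as an iteratively reweighted least squares loop. This is only an illustration under assumptions: the text does not specify the robust weighting, so Tukey biweight weights and a median-based residual scale are used here as stand-ins, and the names `solve3`, `weighted_fit` and `fit_affine` are hypothetical. Each block contributes one sample, its position `(x, y)` and its estimated vector `(dx, dy)`.

```python
from math import hypot
from statistics import median


def solve3(A, b):
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    M = [list(row) + [rhs] for row, rhs in zip(A, b)]
    for c in range(3):
        p = max(range(c, 3), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, 3):
            f = M[r][c] / M[c][c]
            for k in range(c, 4):
                M[r][k] -= f * M[c][k]
    x = [0.0, 0.0, 0.0]
    for r in (2, 1, 0):
        s = M[r][3] - sum(M[r][k] * x[k] for k in range(r + 1, 3))
        x[r] = s / M[r][r]
    return x


def weighted_fit(points, values, weights):
    """Weighted least-squares fit of v = c0 + c1*x + c2*y via the normal equations."""
    A = [[0.0] * 3 for _ in range(3)]
    b = [0.0] * 3
    for (x, y), v, w in zip(points, values, weights):
        phi = (1.0, x, y)
        for i in range(3):
            b[i] += w * phi[i] * v
            for j in range(3):
                A[i][j] += w * phi[i] * phi[j]
    return solve3(A, b)


def fit_affine(points, vectors, iters=10, c=4.685):
    """Estimate (a1..a6) of Eq. (10) by iteratively reweighted least squares.

    Assumed choices (not from the text): Tukey biweight weights with tuning
    constant c, and a scale derived from the median residual.
    Returns the six parameters and the final per-block weights w(B).
    """
    w = [1.0] * len(points)
    for _ in range(iters):
        ax = weighted_fit(points, [v[0] for v in vectors], w)
        ay = weighted_fit(points, [v[1] for v in vectors], w)
        # Residual of each block vector with respect to the current model.
        res = [hypot(dx - (ax[0] + ax[1] * x + ax[2] * y),
                     dy - (ay[0] + ay[1] * x + ay[2] * y))
               for (x, y), (dx, dy) in zip(points, vectors)]
        cutoff = c * (1.4826 * median(res) + 1e-9)
        w = [(1.0 - (r / cutoff) ** 2) ** 2 if r < cutoff else 0.0 for r in res]
    return ax + ay, w
```

Blocks whose final weight is zero or close to zero are the candidates for the outlier set discussed next.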

The outliers with regard to this model, i.e. the blocks with weak weights w(B), form the motion mask M_t at the top of the pyramid and serve for the extraction of objects O_t. When estimating the model of Eq. (10), the coefficients of the HF subbands are used to exclude a priori the “flat areas” of the LL subband, which are not reliable for motion estimation. Here the standard deviation vector

\[
\sigma(B) = (\sigma_{LH}, \sigma_{HL}, \sigma_{HH})^T
\]

is computed for each block. If its norm \( \|\sigma(B)\|_\infty \) is less than a level-dependent threshold Th^k, then the block is considered “flat”.
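The flat-block test can be sketched as follows. It assumes that the LH, HL and HH coefficients co-located with a block are available as small 2-D lists, and that the population standard deviation is meant; the text does not say which estimator is used. The names `hf_sigma` and `is_flat` are hypothetical. Since standard deviations are non-negative, the infinity norm reduces to a maximum.

```python
from statistics import pstdev


def hf_sigma(lh_block, hl_block, hh_block):
    """Standard deviation of the coefficients of each HF subband over one block."""
    return tuple(pstdev([c for row in blk for c in row])
                 for blk in (lh_block, hl_block, hh_block))


def is_flat(lh_block, hl_block, hh_block, threshold):
    """True when the infinity norm of sigma(B) is below the level threshold."""
    return max(hf_sigma(lh_block, hl_block, hh_block)) < threshold
```

A block is thus kept for motion estimation as soon as one of the three HF subbands shows enough variation.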

The projection of motion vectors, used to initialize the estimator at the lower levels of the pyramid, follows the location principle on the LL subband, dyadically increasing block sizes and vector magnitudes. The outlier blocks projected with this scheme are then split into smaller blocks in order to keep motion estimation precise in areas with proper motion. Re-estimating the motion model of Eq. (10) at each level of the pyramid improves the PSNR measured on non-outliers by up to 8% on average.
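A minimal sketch of this projection step is given below. It assumes a factor of two between adjacent levels in position, block size and vector magnitude, and it assumes that an outlier block is split into four quadrants; the text only says "smaller blocks". The `Block` type and the `project` function are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Block:
    x: int          # top-left corner in the LL subband of the current level
    y: int
    size: int
    dx: int         # motion vector with pixel accuracy
    dy: int
    outlier: bool = False


def project(blocks):
    """Project blocks and vectors from one pyramid level to the next finer one."""
    projected = []
    for b in blocks:
        x, y, size = 2 * b.x, 2 * b.y, 2 * b.size
        dx, dy = 2 * b.dx, 2 * b.dy
        if not b.outlier:
            projected.append(Block(x, y, size, dx, dy))
            continue
        # Assumed split: an outlier block becomes four quadrants, each
        # initialized with the projected vector of its parent.
        half = size // 2
        for oy in (0, half):
            for ox in (0, half):
                projected.append(Block(x + ox, y + oy, half, dx, dy, True))
    return projected
```

The projected vectors serve only as starting points; the estimator then refines them at the finer level.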

To filter out of the outlier set the blocks that in fact follow the model of Eq. (10), the absolute difference between the optimal MAD values obtained when a block is compensated with its original vector and with the vector given by Eq. (10) is computed. If it is greater than a threshold Th^k_MAD, then the “proper” motion of the block is confirmed. Otherwise, the block is incorporated into the set of blocks following the global motion. The same test is applied to flat blocks. Figure 6 depicts the results of this filtering at the second resolution level of a Daubechies pyramid. The upper row represents the LL subband at level 2, the middle row is the result of outlier rejection by weighted least squares, and the lower row is the result of filtering.
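The filtering test can be sketched as below, taking the MAD criterion of Eq. (9) as a mean absolute difference over the block. Several details are assumptions rather than statements of the text: frames are plain 2-D lists indexed as `frame[y][x]`, Eq. (10) is evaluated at the block center and rounded to pixel accuracy, and the caller guarantees that displaced blocks stay inside the reference frame. All function names are hypothetical.

```python
def mad(cur, ref, x0, y0, size, dx, dy):
    """Mean absolute difference between a block of cur and its displaced match in ref."""
    total = 0.0
    for y in range(y0, y0 + size):
        for x in range(x0, x0 + size):
            total += abs(cur[y][x] - ref[y + dy][x + dx])
    return total / (size * size)


def global_vector(a, x0, y0, size):
    """Vector predicted by the affine model (a1..a6) at the block center."""
    xc = x0 + (size - 1) / 2.0
    yc = y0 + (size - 1) / 2.0
    return (round(a[0] + a[1] * xc + a[2] * yc),
            round(a[3] + a[4] * xc + a[5] * yc))


def has_proper_motion(cur, ref, x0, y0, size, own_vec, a, th_mad):
    """True when the block's own vector and the global model disagree in MAD."""
    gdx, gdy = global_vector(a, x0, y0, size)
    mad_own = mad(cur, ref, x0, y0, size, own_vec[0], own_vec[1])
    mad_global = mad(cur, ref, x0, y0, size, gdx, gdy)
    return abs(mad_own - mad_global) > th_mad
```

Blocks for which `has_proper_motion` returns `False` would be moved back to the set of blocks following the global motion.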

The merged motion masks and the segmentation map at the top of the pyramid form the extracted objects (see the example in Figure 7).

To form a scalable object-based descriptor, it is necessary to obtain the extracted objects at all levels of the pyramid. The object masks extracted at the top of the pyramid have to be projected and refined at each level. While the projection across pyramid levels is naturally guided by the wavelet location principle (Figure 5), fitting the object boundaries to the LL subband content at the lower pyramid levels is a problem per se. It is natural to try to use the contour information already available in the HF subbands. This can be done in the framework of Markov Random Field (MRF) modeling.
