Survey of Dirac: A Wavelet Based Video Codec for Multiparty Video Conferencing and Broadcasting - Intelligent Video Event Analysis and Understanding

Interlace coding

Dirac supports interlace coding by coding sequences of fields, rather than frames.

5.3.2 Motion Estimation

Dirac's motion estimation consists of three stages. In the first stage, motion vectors are evaluated for every block of each frame to one pixel accuracy using hierarchical motion estimation. In the second stage, these vectors are refined to sub-pixel accuracy. In the third stage, mode decision is performed, in which motion vectors are aggregated by grouping blocks with similar motion.

Motion estimation is most accurate when all three colour components are involved, but this is more expensive computationally and more complicated algorithmically. Dirac therefore uses only the luma (Y) component for ME.

5.3.2.1 Finding Motion Vectors of One Pixel Accuracy Using Hierarchical Motion Estimation

Hierarchical ME speeds up the search by repeatedly down converting both the current and the reference frame by a factor of two in both dimensions, and performing motion estimation on the smaller pictures. Dirac first determines the number of down conversion levels, which is calculated using equation 1 as follows:

\[
\text{level} = \min\left( \log_2\left(\frac{\text{width}}{12}\right),\ \log_2\left(\frac{\text{height}}{12}\right) \right) \qquad (1)
\]

Rounded down to an integer, equation 1 gives 4 down conversion levels for the CIF (352*288) frame format and 3 for the QCIF (176*144) frame format. At each level, down conversion reduces the height and width of the current and reference frames by a factor of 2, so the frame area becomes one quarter of its size at the previous level. Hence both a CIF image of size 352*288 and a QCIF image of size 176*144 are reduced to 22*18 at the last level.
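The level counts and final frame sizes quoted above can be checked with a short Python sketch. The function names are hypothetical, and truncating the result of equation 1 to an integer is an assumption that reproduces the figures in the text.

```python
import math

def down_conversion_levels(width, height, block=12):
    """Equation 1, truncated to an integer (the truncation is an assumption)."""
    return int(min(math.log2(width / block), math.log2(height / block)))

def smallest_frame_size(width, height):
    """Frame size after halving both dimensions once per level."""
    levels = down_conversion_levels(width, height)
    return width >> levels, height >> levels

for name, (w, h) in {"CIF": (352, 288), "QCIF": (176, 144)}.items():
    print(name, down_conversion_levels(w, h), smallest_frame_size(w, h))
# CIF 4 (22, 18)
# QCIF 3 (22, 18)
```

For both formats the height is the limiting dimension: 288/12 = 24 and 144/12 = 12, whose base-2 logarithms are about 4.58 and 3.58.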

Motion estimation starts at the lowest resolution level (level 4 for CIF) and moves gradually to higher resolution levels until it reaches the original frame size. At each level of the hierarchy except the smallest, vectors from the lower resolution level are used as a guide for the search at the next higher level. Each block at a lower resolution level corresponds to four blocks at the immediately higher resolution level, so it provides a guide motion vector to at most 4 blocks at that level. The block sizes are variable. The middle blocks are of size 12*12. The other block sizes are 10*10, 10*12, 10*6, 10*4, 10*8, 8*10, 6*10 or 4*10, depending upon the location of the block and the size of the frame, and these sizes are consistent at each level of the motion estimation hierarchy.
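The coarse-to-fine search described above can be illustrated with a minimal Python sketch. It is not Dirac's implementation. The 2x2 averaging filter, the sum-of-absolute-differences (SAD) cost, the search radius of one pixel, the fixed 4*4 block size and all function names are assumptions made to keep the example small. What it does take from the text is the structure: both frames are down converted by two per level, the search starts at the smallest level, and each block passes its vector (doubled) as a guide to the four blocks it covers at the next higher resolution.

```python
def downconvert(frame):
    """Halve width and height by averaging each 2x2 group of luma samples."""
    h, w = len(frame) // 2, len(frame[0]) // 2
    return [[(frame[2*y][2*x] + frame[2*y][2*x+1] +
              frame[2*y+1][2*x] + frame[2*y+1][2*x+1]) // 4
             for x in range(w)] for y in range(h)]

def sad(cur, ref, bx, by, dx, dy, size):
    """SAD between a block of cur and the displaced block of ref.
    Reference coordinates are clamped to the frame edges."""
    h, w = len(ref), len(ref[0])
    total = 0
    for y in range(by, min(by + size, len(cur))):
        for x in range(bx, min(bx + size, len(cur[0]))):
            ry = min(max(y + dy, 0), h - 1)
            rx = min(max(x + dx, 0), w - 1)
            total += abs(cur[y][x] - ref[ry][rx])
    return total

def search(cur, ref, bx, by, guide, size, radius):
    """Full search in a small window centred on the guide vector."""
    best = None
    for dy in range(guide[1] - radius, guide[1] + radius + 1):
        for dx in range(guide[0] - radius, guide[0] + radius + 1):
            cost = sad(cur, ref, bx, by, dx, dy, size)
            if best is None or cost < best[0]:
                best = (cost, (dx, dy))
    return best[1]

def hierarchical_me(cur, ref, levels, size=4, radius=1):
    """Return one-pixel-accurate (dx, dy) vectors per block at full resolution."""
    # Index 0 is the original frame, index `levels` the smallest picture.
    pyr_c, pyr_r = [cur], [ref]
    for _ in range(levels):
        pyr_c.append(downconvert(pyr_c[-1]))
        pyr_r.append(downconvert(pyr_r[-1]))
    vectors = None
    for lvl in range(levels, -1, -1):
        c, r = pyr_c[lvl], pyr_r[lvl]
        rows = (len(c) + size - 1) // size
        cols = (len(c[0]) + size - 1) // size
        new = [[None] * cols for _ in range(rows)]
        for j in range(rows):
            for i in range(cols):
                if vectors is None:
                    guide = (0, 0)  # smallest level: no guide available
                else:
                    # Parent block at the lower resolution level; its vector
                    # is doubled because the picture is twice as large here.
                    pj = min(j // 2, len(vectors) - 1)
                    pi = min(i // 2, len(vectors[0]) - 1)
                    guide = (2 * vectors[pj][pi][0], 2 * vectors[pj][pi][1])
                new[j][i] = search(c, r, i * size, j * size, guide, size, radius)
        vectors = new
    return vectors

# Synthetic luma frames: the current frame is the reference shifted by 4 pixels in x.
ref = [[x * x + 2 * y * y for x in range(32)] for y in range(32)]
cur = [[(x + 4) ** 2 + 2 * y * y for x in range(32)] for y in range(32)]
vectors = hierarchical_me(cur, ref, levels=2)
print(vectors[1][1])  # (4, 0)
```

With a radius of one pixel per level, a 4-pixel displacement is found as 1 pixel at the smallest level, 2 at the middle level and 4 at full resolution. A direct full-resolution search would need a radius of at least 4 to find it.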
