Scene Video Coding - Advanced Video Coding Systems

background model-based fast-and-efficient transcoding (FET) platform with AVC/H.264 high profile. They transcode the input AVC/H.264 streams at 1,000 kbps for eight sequences (crossroad-sd, overbridge-sd, office-sd, bank-sd, crossroad-cif, overbridge-cif, snowroad-cif, and snowgate-cif) to output streams at bit rates of 64, 128, 256, and 512 kbps. The four methods are the Gaussian mixture model (Haque et al. 2008b) using 1 or 5 models for each pixel (GMM-1 or GMM-5), the mean shift (MS) method proposed in Liu et al. (2007a), and the widely used Gaussian running average (RA). For background modeling in surveillance and conference video transcoding, as noted by Piccardi (2004), performance, memory cost, and running time are equally important factors. The memory cost at each pixel position is calculated as follows. (1) RA: the current pixel, of type char, and one single-precision float mean value are buffered for each pixel. (2) GMM-X: besides the buffered input pixel, a GMM model must also be buffered. The model is composed of a double-precision mean value, variance, and weight. Moreover, an 8-bit value is stored for each GMM model to count its number of matched points. (3) MS: mean shift-based algorithms usually buffer all the training frames, and very few additional temporary variables are used for the clustering and sorting operations.
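The per-pixel memory costs described above can be tallied with a short sketch. It assumes typical C type sizes (char = 1 byte, float = 4 bytes, double = 8 bytes), which the text does not state, and it ignores the few temporary variables used by MS. The function names and the 120-frame example are hypothetical, chosen only for illustration.

```python
# Assumed type sizes in bytes (typical C sizes; not given in the text).
CHAR, FLOAT, DOUBLE = 1, 4, 8


def ra_bytes_per_pixel():
    """RA: current pixel (char) plus one float mean value."""
    return CHAR + FLOAT


def gmm_bytes_per_pixel(num_models):
    """GMM-X: buffered input pixel plus, per model, a double-precision
    mean, variance, and weight and an 8-bit matched-point counter."""
    per_model = 3 * DOUBLE + CHAR
    return CHAR + num_models * per_model


def ms_bytes_per_pixel(num_training_frames):
    """MS: one char per buffered training frame (temporaries ignored)."""
    return CHAR * num_training_frames


if __name__ == "__main__":
    print("RA   :", ra_bytes_per_pixel())
    print("GMM-1:", gmm_bytes_per_pixel(1))
    print("GMM-5:", gmm_bytes_per_pixel(5))
    print("MS (120 training frames, hypothetical):", ms_bytes_per_pixel(120))
```

Under these assumptions, the RA cost is constant, the GMM cost grows linearly with the number of models, and the MS cost grows linearly with the number of training frames.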

To maintain or improve background quality, an ideal solution for background modeling is to calculate the mean value of all the purely background pixels in the training frames. However, it is very difficult to determine exactly which pixels belong to the background. Physically, the background corresponds to the most frequently appearing content. This inspires FET to use a novel segment-and-weight-based running average (SWRA), which approximates the background by giving larger weight to frequently appearing values in the averaging process. Because SWRA is based on a running average procedure, it incurs neither a large memory cost nor high computational complexity. In general, SWRA divides the pixels at a position in the training frames into temporal segments, each with its own mean value and weight, and then calculates the running weighted average of the segment mean values. In the process, pixels in the same segment have the same background/foreground property, and long segments have larger weights. The method does not need to know whether each segment is foreground or background, so foreground recognition is avoided. Meanwhile, low memory cost and modeling without delay are guaranteed.
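The general idea can be illustrated with a minimal sketch for a single pixel position. This is not the authors' exact procedure. It makes three assumptions: a new segment starts when a sample differs from the current segment mean by more than a fixed threshold, the threshold stays constant, and a segment's weight is the square of its length. The function name swra_sketch and these choices are hypothetical; the squared weight is used only so that long segments count more than they would in a plain mean.

```python
def swra_sketch(samples, th=14.0):
    """Illustrative segment-and-weight running average for one pixel.

    Assumptions (not from the source): fixed threshold `th`, and
    segment weight = (segment length) ** 2.
    """
    model_avg, model_w = 0.0, 0.0  # running weighted average and its weight
    seg_mean, seg_len = 0.0, 0     # mean and length of the current segment

    def fold(model_avg, model_w, seg_mean, seg_len):
        # Merge a finished segment into the model by weighted averaging.
        weight = float(seg_len ** 2)
        if weight == 0.0:
            return model_avg, model_w
        total = model_w + weight
        return (model_avg * model_w + seg_mean * weight) / total, total

    for value in samples:
        # A large jump from the segment mean closes the current segment.
        if seg_len > 0 and abs(value - seg_mean) > th:
            model_avg, model_w = fold(model_avg, model_w, seg_mean, seg_len)
            seg_mean, seg_len = 0.0, 0
        # Incremental mean update: no training frames are buffered.
        seg_len += 1
        seg_mean += (value - seg_mean) / seg_len

    # Fold in the last open segment.
    model_avg, _ = fold(model_avg, model_w, seg_mean, seg_len)
    return model_avg


if __name__ == "__main__":
    # Eight background-like samples followed by two foreground-like samples.
    pixels = [100] * 8 + [200] * 2
    print("plain mean :", sum(pixels) / len(pixels))
    print("SWRA sketch:", swra_sketch(pixels))
```

Only a handful of scalars are kept per pixel, and no segment is ever classified as foreground or background, which matches the low-memory, recognition-free character described above.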

In detail, SWRA models a background value of the pixels at position (x, y) by the following five steps:

1. Initialization: Initialize the background model value AVG and its weight W, used in the following weighted average procedure, to 0, and then create the first segment. The length L of the first segment equals 0 and its mean value avg = 0. The model value before the current segment, avg', is also set to 0.

2. Calculate the threshold for segmenting: Supposing μ is the mean value and σ² is the mean square error, the probability of |f(x) − μ| > 2σ in a normal distribution f(x) is less than 5%. So we use 2σ as the threshold th to temporally segment a pixel in the training frames. The threshold th is initialized to 14 and updated by two times σ
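The tail probability behind the 2σ threshold can be checked numerically. The sketch below uses the standard identity P(|X − μ| > kσ) = erfc(k / √2) for a normally distributed X; the function name two_sided_tail is hypothetical.

```python
import math


def two_sided_tail(k):
    """Probability that a normal variable lies more than k standard
    deviations from its mean, on either side."""
    return math.erfc(k / math.sqrt(2.0))


if __name__ == "__main__":
    print("P(|X - mu| > 2*sigma) =", two_sided_tail(2.0))
```

For k = 2 this evaluates to roughly 0.0455, so a sample that falls outside the 2σ band is unlikely to belong to the same segment.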
