Scene Video Coding - Advanced Video Coding Systems


Table 8.2 Classification of effect components

                      Long term          Short term
Constant deviation    S_illum, D_cam     M_bgd
Random deviation      N_sys              M_obj

for a relatively long time period. Take S_illum, for example: if the scene is shot indoors and a light is then switched on, S_illum can be treated as a constant addition to C in the following frames. N_sys and M_obj, by contrast, may take random values at different times. We call them random deviation effects (time variant). The analysis above is summarized in Table 8.2. Note that this is not a strict classification, since it depends on the block size chosen, but this does not materially affect the analysis that follows.

Since S_illum and D_cam cause long-term constant deviations from the ideal background C, we can integrate these components into the ideal background, redefining C as C + S_illum + D_cam. An intuitive explanation of this integration is that if the illumination has changed or the camera has been moved, it is reasonable to consider that the background (the ideal background) itself has changed. So Eq. (8.3) can be further expressed as:

$$V_{obsv} = C + N_{sys} + M_{obj} + M_{bgd}$$    (8.4)

Thus far, the observed value V_obsv is modeled as the sum of the ideal background value C and the effect components (N_sys, M_obj, M_bgd). These effect components make the observation deviate from C in different ways.

• N_sys is present throughout the whole video stream and causes only modest deviation from C, so most of the observed values do not deviate far from C.

• M_obj and M_bgd occur only occasionally but may cause large deviation from C, so only a minority of the observed values differ dramatically from C.

The observation is that the pixel values at a given spatial location should stay stable, with modest deviation, most of the time (due to the long-term random deviation N_sys), while significant deviation (due to the short-term deviations M_obj and M_bgd) may occur only when a moving object passes this location. So the extreme values with significant deviations form only a minority of the observed values in a time period.
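
The observation above can be illustrated with a small simulation of one pixel location over time. This is only a sketch under assumed values: the function names, the background value of 120, the noise level, the probability that an object covers the pixel, and the size of the object-induced offset are all hypothetical choices made for illustration, not figures from the text.

```python
import random


def simulate_observations(c=120.0, n_frames=500, noise_sigma=2.0,
                          object_prob=0.1, object_offset=80.0, seed=0):
    """Simulate V_obsv = C + N_sys + (occasional large deviation) for one pixel.

    All parameter values are illustrative assumptions.
    """
    rng = random.Random(seed)
    values = []
    for _ in range(n_frames):
        # N_sys: small random deviation, present in every frame.
        v = c + rng.gauss(0.0, noise_sigma)
        # M_obj / M_bgd: large deviation, present only occasionally.
        if rng.random() < object_prob:
            v += rng.choice((-1.0, 1.0)) * object_offset
        values.append(v)
    return values


def fraction_near(values, c, radius):
    """Fraction of observations lying within `radius` of `c`."""
    return sum(abs(v - c) <= radius for v in values) / len(values)


if __name__ == "__main__":
    obs = simulate_observations()
    print("fraction within 10 of C:", fraction_near(obs, 120.0, 10.0))
```

With these assumed settings the large majority of samples stay close to C, and only the frames with an object-induced offset fall far away, which is the property the background estimate relies on.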

Our task is to find an estimate Ĉ of the ideal background C. From the analysis above, we can see that Ĉ should be the center of the region where the majority of the observed values are located. This task can be accomplished by the mean shift procedure. Here, we call Ĉ the most reliable background model.
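
As a rough illustration of how a mean shift procedure can locate the center of the densest region of observed values, the following sketch runs a one-dimensional mean shift with a flat (uniform) kernel over the values seen at a single pixel location. The function name, the bandwidth, the stopping tolerance, and the sample values are assumptions for illustration; the text does not specify the kernel or parameters it uses.

```python
def mean_shift_mode(values, bandwidth=10.0, tol=1e-3, max_iter=100):
    """Return the center of the densest cluster in `values`.

    Uses a flat kernel: each step moves the center to the mean of all
    samples within `bandwidth` of it. Every sample is tried as a start
    point, and the converged center supported by the most samples wins.
    """
    best_mode, best_support = None, -1
    for start in values:
        center = start
        for _ in range(max_iter):
            window = [v for v in values if abs(v - center) <= bandwidth]
            new_center = sum(window) / len(window)
            shift = abs(new_center - center)
            center = new_center
            if shift < tol:
                break
        support = sum(abs(v - center) <= bandwidth for v in values)
        if support > best_support:
            best_mode, best_support = center, support
    return best_mode


if __name__ == "__main__":
    # Hypothetical luminance samples at one location; 200 and 205 stand in
    # for frames in which a moving object covers the pixel.
    samples = [118, 121, 120, 119, 122, 200, 120, 121, 205, 119, 120, 118]
    print("plain mean:     ", sum(samples) / len(samples))
    print("mean shift mode:", mean_shift_mode(samples))
```

The plain temporal mean is pulled toward the object values, whereas the mean shift result stays at the center of the majority cluster, which is why a mode-seeking estimate suits this task better than simple averaging.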

According to Eq. (8.2), a background frame suitable for object detection should be constantly updated using the original input frames. However, that is not feasible in a surveillance video coding system, because each updated background frame must be encoded into the stream again to guarantee a match at the decoder. To avoid the bursts in bit rate caused by updating the background frame, the surveillance video coding framework updates the background frame only once per long period (namely, an LGOP), and the frames in the next LGOP use the last reconstructed background frame as their prediction reference. Consequently, the best background frame B_g in surveillance video coding for the n frames in the next LGOP satisfies:
