Table 8.2 Classification of effect components

                      Long term          Short term
Constant deviation    S_illum, D_cam     M_bgd
Random deviation      N_sys              M_obj
for a relatively long time period. Take S_illum, for example: if the scene is shot
indoors and a light is then switched on, S_illum can be treated as a constant
addition to C in the following frames. N_sys and M_obj, by contrast, may take
random values at different times. We call them random deviation effects (time
variant). The analysis above is summarized in Table 8.2. Note that this is not a
strict classification, since it depends on the block size chosen, but this does not
materially affect the following analysis.
Since S_illum and D_cam cause long-term constant deviation from the ideal
background C, we can integrate these components into the ideal background by
replacing C with C + S_illum + D_cam; the symbol C is kept for the updated
background. An intuitive explanation of this integration is that if the
illumination has changed or the camera has been moved, it is reasonable to
consider that the background (ideal background) has changed. So Eq. (8.3) can be
further expressed as:

    V_{obsv} = C + N_{sys} + M_{obj} + M_{bgd}    (8.4)
Thus far, the observation value V_obsv is modeled as the sum of the ideal
background value C and the effect components (N_sys, M_obj, M_bgd). These effect
components influence C in different ways.
N_sys occurs throughout the whole video stream and causes modest deviation from C,
so most of the observed values will not deviate far from C.
M_obj and M_bgd occur only occasionally and may cause large deviation from C, so
only a minority of the observed values will differ dramatically from C.
The key observation is that the pixel values at a spatial location should remain
stable, with modest deviation, for most of the time (due to the long-term random
deviation N_sys), and significant deviation (due to the short-term deviations M_obj
and M_bgd) may occur only when a moving object passes this location. So the extreme
values with significant deviation form only a minority of the observed values in a
time period.
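The observation above can be illustrated with a small synthetic experiment. The sketch below generates values for one pixel location following Eq. (8.4) and counts how many stay close to C. Every name and number in it (`simulate_pixel`, the background value 120, the noise level, the object probability and intensity range) is an assumption made for illustration, and M_bgd is simply set to zero.

```python
import random

def simulate_pixel(c=120.0, n_frames=300, noise_sigma=2.0,
                   object_prob=0.1, seed=0):
    """Simulate V_obsv = C + N_sys + M_obj + M_bgd for one pixel location.

    All parameter values are illustrative assumptions, not taken from the text.
    """
    rng = random.Random(seed)
    values = []
    for _ in range(n_frames):
        # N_sys: present in every frame, modest random deviation.
        n_sys = rng.gauss(0.0, noise_sigma)
        # M_obj: occasional, large deviation while an object covers the pixel.
        m_obj = rng.uniform(40.0, 100.0) if rng.random() < object_prob else 0.0
        # M_bgd: not simulated in this sketch.
        m_bgd = 0.0
        values.append(c + n_sys + m_obj + m_bgd)
    return values

values = simulate_pixel()
# Count observations within 3 * noise_sigma of the ideal background C.
near = sum(1 for v in values if abs(v - 120.0) <= 6.0)
fraction_near = near / len(values)
print(f"{fraction_near:.2f} of the observations lie within 3 sigma of C")
```

With these assumed settings, the large majority of observations cluster around C, and only the frames covered by the simulated object fall far away from it.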
Our task is to find an estimate Ĉ of the ideal background C. From the analysis
above, Ĉ should be the center of the region where the majority of the observed
values are located. This task can be accomplished by the mean shift procedure.
Here, we call Ĉ the most reliable background model.
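As a rough illustration of how a mean shift procedure can locate the center of the majority of observed values, the following sketch runs a one-dimensional mean shift over the values of a single pixel location. The text does not specify the kernel, bandwidth, or starting points, so the Gaussian kernel, the bandwidth of 5, the ten evenly spaced seeds, and the rule of keeping the densest converged mode are all assumptions, as are the function names and the synthetic data.

```python
import math
import random

def gaussian_weights(x, values, bandwidth):
    """Gaussian kernel weight of every sample relative to position x."""
    return [math.exp(-0.5 * ((v - x) / bandwidth) ** 2) for v in values]

def mean_shift_mode(values, bandwidth=5.0, n_seeds=10, tol=1e-3, max_iter=200):
    """Return the densest mode of 1-D `values` found by mean shift.

    Kernel, bandwidth, and seeding strategy are assumptions of this sketch.
    """
    ordered = sorted(values)
    step = max(1, len(ordered) // n_seeds)
    best_x, best_density = None, -1.0
    for x in ordered[::step]:              # a few evenly spaced starting points
        for _ in range(max_iter):
            w = gaussian_weights(x, values, bandwidth)
            # Mean shift step: move x to the kernel-weighted mean of the samples.
            new_x = sum(wi * vi for wi, vi in zip(w, values)) / sum(w)
            converged = abs(new_x - x) < tol
            x = new_x
            if converged:
                break
        # Keep the converged point with the highest kernel density.
        density = sum(gaussian_weights(x, values, bandwidth))
        if density > best_density:
            best_x, best_density = x, density
    return best_x

# Synthetic pixel history: 270 background frames near 120, 30 object frames.
rng = random.Random(0)
background = [120.0 + rng.gauss(0.0, 2.0) for _ in range(270)]
passing_object = [120.0 + rng.uniform(40.0, 100.0) for _ in range(30)]
values = background + passing_object

c_hat = mean_shift_mode(values)
plain_mean = sum(values) / len(values)
print(f"mean shift estimate: {c_hat:.1f}, plain mean: {plain_mean:.1f}")
```

In this synthetic case the plain mean is pulled toward the object values, while the mean shift estimate stays near the cluster formed by the majority of observations, which is the behavior the text asks of Ĉ.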
According to Eq. (8.2), the background frame suitable for object detection should
be constantly updated using the original input frame. However, that is not feasible
in a surveillance video coding system, because each updated background frame would
have to be encoded into the stream again to guarantee that the decoder matches the
encoder. To avoid the bit-rate bursts caused by updating the background frame, the
surveillance video coding framework updates the background frame only once in a
long period (namely, an LGOP), and the frames in the next LGOP use the last
reconstructed background frame as their prediction reference. Consequently, the best
background frame B_g in surveillance video coding for the n frames in the next LGOP
satisfies:
 