Table 10.1  The bitrate reduction and time saving of the model-based HEVC over the HEVC test model HM with low-delay configurations

                     SD (%)   720p (%)   1,600 × 1,200 (%)   1080p (%)   Avg. (%)
  Bitrate reduction  45.40    53.47      45.43               39.24       45.89
  Time saving        49.17    63.64      45.94               24.68       45.86
and at the same time reducing the encoding complexity. Based on the experimental results on the PKU-SVD-A surveillance video dataset reported in Table 10.1, the techniques in IEEE 1857 save, on average, 45.89% of the bits and 45.86% of the encoding time compared with HEVC.
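For readers who wish to verify the Avg. column, the short Python sketch below (purely illustrative) recomputes both row averages from the per-resolution figures in Table 10.1.

# Recompute the row averages (Avg. column) of Table 10.1 from the
# per-resolution figures: SD, 720p, 1,600 x 1,200, 1080p.
bitrate_reduction = [45.40, 53.47, 45.43, 39.24]
time_saving = [49.17, 63.64, 45.94, 24.68]

avg_bitrate = sum(bitrate_reduction) / len(bitrate_reduction)  # ~45.89
avg_time = sum(time_saving) / len(time_saving)                 # ~45.86

print(f"average bitrate reduction: {avg_bitrate:.2f}%")
print(f"average time saving:       {avg_time:.2f}%")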
Recognition-friendly schema
Often, surveillance video is captured not for a better viewing experience, but for the analysis and recognition of specific objects or events. Traditionally, the principal objective of video coding has been to improve the compression rate of the whole frame, without distinguishing the background or the ROIs within it. However, one of the most important prerequisites for video analysis and recognition is a relatively high fidelity (or clarity) of the ROIs, which often contain moving objects (e.g., pedestrians, vehicles) or faces. In this sense, the IEEE 1857 surveillance groups make a useful attempt to integrate these two seemingly contradictory subfields into a unified framework. On the one hand, they employ background modeling in the coding loop to remove the “scenic redundancy” across consecutive frames, thereby achieving higher coding efficiency; on the other hand, they are arguably the first technology proposed in a video coding standard to provide strong support for video analysis and recognition, including syntax for describing ROIs and camera parameters, high-clarity ROI coding, and object detection based on macroblock (MB) classification.
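As a rough illustration of background modeling in the coding loop, the following Python sketch estimates a background picture from a window of training frames using a per-pixel temporal median (one common choice; the particular model is not prescribed here) and shows that static “scenic” regions then leave an almost-zero residual to code. All function and variable names are illustrative, not part of IEEE 1857.

import numpy as np

def build_background(training_frames):
    """Estimate a background picture from N training frames using a
    per-pixel temporal median (a running average is another common choice)."""
    stack = np.stack(training_frames, axis=0).astype(np.float32)
    return np.median(stack, axis=0).astype(np.uint8)

def background_residual(frame, background):
    """Predict the current frame from the modeled background and return the
    residual an encoder would still have to code; static (scenic) regions
    leave a near-zero residual."""
    return frame.astype(np.int16) - background.astype(np.int16)

# Toy usage: eight frames of a static scene with one moving bright block.
rng = np.random.default_rng(0)
scene = rng.integers(0, 256, size=(64, 64), dtype=np.uint8)
frames = []
for t in range(8):
    f = scene.copy()
    f[10:20, 5 + 4 * t:15 + 4 * t] = 255  # a "moving object"
    frames.append(f)

bg = build_background(frames)
res = background_residual(frames[-1], bg)
print("mean |residual|, static area:", abs(res[40:, 40:]).mean())    # 0 here
print("mean |residual|, object area:", abs(res[10:20, 33:43]).mean())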
In the IEEE 1857 surveillance groups, the ROIs of attended objects can be embedded in the coded bitstream and thus directly extracted for further analysis and recognition tasks. Figure 10.5a shows the conceptual schema of the standard syntax, which supports three levels of semantic description of a video sequence, ranging from ROIs to objects and events. As an example, Fig. 10.5b illustrates the syntax used to describe ROIs in the coded bitstream. These semantic descriptions, whether manually labeled or automatically extracted, can be used in video analysis tasks such as object detection and tracking (as shown in Fig. 10.5c). In addition, the standard syntax supports the description of the camera's parameters and positioning information (e.g., GPS) in the video stream; these data can facilitate challenging visual tasks such as camera calibration and object tracking across cameras over a wide area.
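To give a concrete, non-normative picture of such in-bitstream semantic description, the following Python sketch models the three description levels (ROI, object, event) together with camera/GPS metadata as simple per-frame records that could accompany the coded pictures. Every field name and type here is an assumption made for illustration; the actual syntax elements are those defined by the standard and outlined in Fig. 10.5.

from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class ROI:
    """A region of interest attached to one coded frame (illustrative fields)."""
    roi_id: int
    x: int
    y: int
    width: int
    height: int  # bounding box in luma samples

@dataclass
class ObjectDescription:
    """An object tracked over time, referring to its ROIs by id."""
    object_id: int
    label: str  # e.g. 'pedestrian', 'vehicle', 'face'
    roi_ids: List[int] = field(default_factory=list)

@dataclass
class EventDescription:
    """A higher-level event composed of objects (e.g. 'vehicle enters zone')."""
    event_id: int
    event_type: str
    object_ids: List[int] = field(default_factory=list)

@dataclass
class CameraMetadata:
    """Camera parameters and positioning information carried with the stream."""
    focal_length_mm: Optional[float] = None
    gps_lat: Optional[float] = None  # placeholder fields, not normative syntax
    gps_lon: Optional[float] = None

@dataclass
class FrameSideInfo:
    """Per-frame semantic side information multiplexed with the coded picture."""
    frame_number: int
    rois: List[ROI] = field(default_factory=list)
    objects: List[ObjectDescription] = field(default_factory=list)
    events: List[EventDescription] = field(default_factory=list)
    camera: Optional[CameraMetadata] = None

# A decoder-side analysis task can then read ROIs directly, without re-detection:
info = FrameSideInfo(frame_number=120,
                     rois=[ROI(1, 320, 180, 64, 128)],
                     objects=[ObjectDescription(7, "pedestrian", [1])],
                     camera=CameraMetadata(gps_lat=0.0, gps_lon=0.0))
print([(r.x, r.y, r.width, r.height) for r in info.rois])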
Whereas AVC/H.264 has to rely on external ROI-based resource allocation algorithms (Duong et al. 2005) to reach higher compression ratios without perceptually degrading the reconstructed ROIs, the IEEE 1857 surveillance groups support high-clarity ROI coding natively. That is, they can allocate more bits to the ROIs in the coded stream without remarkably increasing the overall bitrate. With background modeling in the coding loop, ROI detection and ROI-based bit allocation can be performed automatically during encoding. In this way, IEEE 1857
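The ROI-based bit allocation described above can be illustrated with a minimal sketch, assuming it is realized as a per-block QP offset: blocks overlapping a detected ROI are coded at a lower QP, while background blocks, already well predicted by the background model, receive a higher QP. The function name, block grid, and offset values below are illustrative and not taken from the standard.

def assign_block_qps(frame_blocks, rois, base_qp=32,
                     roi_qp_offset=-6, background_qp_offset=4):
    """Return a per-block QP list: blocks overlapping an ROI are coded more
    finely (lower QP), background blocks more coarsely (higher QP).
    frame_blocks and rois are lists of (x, y, w, h) rectangles.
    Names and offset values are illustrative, not normative."""
    def overlaps(b, r):
        bx, by, bw, bh = b
        rx, ry, rw, rh = r
        return bx < rx + rw and rx < bx + bw and by < ry + rh and ry < by + bh

    qps = []
    for blk in frame_blocks:
        if any(overlaps(blk, r) for r in rois):
            qps.append(base_qp + roi_qp_offset)         # high-clarity ROI block
        else:
            qps.append(base_qp + background_qp_offset)  # coarse background block
    return qps

# Toy usage: a 4 x 4 grid of 16 x 16 blocks with one ROI in the top-left corner.
blocks = [(x * 16, y * 16, 16, 16) for y in range(4) for x in range(4)]
print(assign_block_qps(blocks, rois=[(0, 0, 24, 24)]))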