Table 10.1  The bitrate reduction and time saving of the model-based HEVC over the HEVC test model HM with low-delay configurations

                     SD (%)   720p (%)   1,600 × 1,200 (%)   1080p (%)   Avg. (%)
  Bitrate reduction  45.40    53.47      45.43               39.24       45.89
  Time saving        49.17    63.64      45.94               24.68       45.86
and at the same time reducing the encoding complexity. Based on the experimental results on the PKU-SVD-A surveillance video dataset reported in Table 10.1, the techniques in IEEE 1857 save, on average, 45.89% of the bits and 45.86% of the encoding time compared with HEVC.
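For readers who wish to verify the Avg. column, the short Python sketch below (purely illustrative) recomputes both row averages from the per-resolution figures in Table 10.1.

# Recompute the row averages (Avg. column) of Table 10.1 from the
# per-resolution figures: SD, 720p, 1,600 x 1,200, 1080p.
bitrate_reduction = [45.40, 53.47, 45.43, 39.24]
time_saving = [49.17, 63.64, 45.94, 24.68]

avg_bitrate = sum(bitrate_reduction) / len(bitrate_reduction)  # ~45.89
avg_time = sum(time_saving) / len(time_saving)                 # ~45.86

print(f"average bitrate reduction: {avg_bitrate:.2f}%")
print(f"average time saving:       {avg_time:.2f}%")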
Recognition-friendly schema
Often, surveillance video is captured not for a better viewing experience, but for the analysis and recognition of specific objects or events. Traditionally, the principal objective of video coding has been to improve the compression rate of the whole frame, without distinguishing the background or the ROIs within it. However, one of the most important prerequisites for video analysis and recognition is a relatively high fidelity (or clarity) of the ROIs, which often contain moving objects (e.g., pedestrians, vehicles) or faces. In this sense, the IEEE 1857 surveillance groups make a useful attempt to integrate these two seemingly contradictory subfields into a unified framework. On the one hand, they employ background modeling in the coding loop to remove the “scenic redundancy” across consecutive frames, thereby achieving higher coding efficiency; on the other hand, they are arguably the first technology proposed in a video coding standard to provide strong support for video analysis and recognition, including syntax for describing ROIs and camera parameters, high-clarity ROI coding, and object detection based on macroblock (MB) classification.
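As a rough illustration of background modeling in the coding loop, the following Python sketch estimates a background picture from a window of training frames using a per-pixel temporal median (one common choice; the particular model is not prescribed here) and shows that static “scenic” regions then leave an almost-zero residual to code. All function and variable names are illustrative, not part of IEEE 1857.

import numpy as np

def build_background(training_frames):
    """Estimate a background picture from N training frames using a
    per-pixel temporal median (a running average is another common choice)."""
    stack = np.stack(training_frames, axis=0).astype(np.float32)
    return np.median(stack, axis=0).astype(np.uint8)

def background_residual(frame, background):
    """Predict the current frame from the modeled background and return the
    residual an encoder would still have to code; static (scenic) regions
    leave a near-zero residual."""
    return frame.astype(np.int16) - background.astype(np.int16)

# Toy usage: eight frames of a static scene with one moving bright block.
rng = np.random.default_rng(0)
scene = rng.integers(0, 256, size=(64, 64), dtype=np.uint8)
frames = []
for t in range(8):
    f = scene.copy()
    f[10:20, 5 + 4 * t:15 + 4 * t] = 255  # a "moving object"
    frames.append(f)

bg = build_background(frames)
res = background_residual(frames[-1], bg)
print("mean |residual|, static area:", abs(res[40:, 40:]).mean())    # 0 here
print("mean |residual|, object area:", abs(res[10:20, 33:43]).mean())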
In the IEEE 1857 surveillance groups, the ROIs of attended objects can be embedded in the coded bitstream and thus directly extracted for further analysis and recognition tasks. Figure 10.5a shows the conceptual schema of the standard syntax, which supports three levels of semantic description of a video sequence, ranging from ROIs to objects and events. As an example, Fig. 10.5b illustrates the syntax used to describe ROIs in the coded bitstream. These semantic descriptions, whether manually labeled or automatically extracted, can be used in video analysis tasks such as object detection and tracking (as shown in Fig. 10.5c). In addition, the standard syntax supports the description of the camera's parameters and positioning information (e.g., GPS) in the video stream; these data can facilitate challenging visual tasks such as camera calibration and object tracking across cameras over a wide area.
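To give a concrete, non-normative picture of such in-bitstream semantic description, the following Python sketch models the three description levels (ROI, object, event) together with camera/GPS metadata as simple per-frame records that could accompany the coded pictures. Every field name and type here is an assumption made for illustration; the actual syntax elements are those defined by the standard and outlined in Fig. 10.5.

from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class ROI:
    """A region of interest attached to one coded frame (illustrative fields)."""
    roi_id: int
    x: int
    y: int
    width: int
    height: int  # bounding box in luma samples

@dataclass
class ObjectDescription:
    """An object tracked over time, referring to its ROIs by id."""
    object_id: int
    label: str  # e.g. 'pedestrian', 'vehicle', 'face'
    roi_ids: List[int] = field(default_factory=list)

@dataclass
class EventDescription:
    """A higher-level event composed of objects (e.g. 'vehicle enters zone')."""
    event_id: int
    event_type: str
    object_ids: List[int] = field(default_factory=list)

@dataclass
class CameraMetadata:
    """Camera parameters and positioning information carried with the stream."""
    focal_length_mm: Optional[float] = None
    gps_lat: Optional[float] = None  # placeholder fields, not normative syntax
    gps_lon: Optional[float] = None

@dataclass
class FrameSideInfo:
    """Per-frame semantic side information multiplexed with the coded picture."""
    frame_number: int
    rois: List[ROI] = field(default_factory=list)
    objects: List[ObjectDescription] = field(default_factory=list)
    events: List[EventDescription] = field(default_factory=list)
    camera: Optional[CameraMetadata] = None

# A decoder-side analysis task can then read ROIs directly, without re-detection:
info = FrameSideInfo(frame_number=120,
                     rois=[ROI(1, 320, 180, 64, 128)],
                     objects=[ObjectDescription(7, "pedestrian", [1])],
                     camera=CameraMetadata(gps_lat=0.0, gps_lon=0.0))
print([(r.x, r.y, r.width, r.height) for r in info.rois])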
Whereas AVC/H.264 has to rely on external ROI-based resource allocation algorithms (Duong et al. 2005) to reach higher compression ratios without perceptually degrading the reconstructed ROIs, the IEEE 1857 surveillance groups support high-clarity ROI coding natively. That is, they can allocate more bits to the ROIs in the coded stream without remarkably increasing the overall bitrate. With background modeling in the coding loop, ROI detection and ROI-based bit allocation can be performed automatically during encoding. In this way, IEEE 1857
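The ROI-based bit allocation described above can be illustrated with a minimal sketch, assuming it is realized as a per-block QP offset: blocks overlapping a detected ROI are coded at a lower QP, while background blocks, already well predicted by the background model, receive a higher QP. The function name, block grid, and offset values below are illustrative and not taken from the standard.

def assign_block_qps(frame_blocks, rois, base_qp=32,
                     roi_qp_offset=-6, background_qp_offset=4):
    """Return a per-block QP list: blocks overlapping an ROI are coded more
    finely (lower QP), background blocks more coarsely (higher QP).
    frame_blocks and rois are lists of (x, y, w, h) rectangles.
    Names and offset values are illustrative, not normative."""
    def overlaps(b, r):
        bx, by, bw, bh = b
        rx, ry, rw, rh = r
        return bx < rx + rw and rx < bx + bw and by < ry + rh and ry < by + bh

    qps = []
    for blk in frame_blocks:
        if any(overlaps(blk, r) for r in rois):
            qps.append(base_qp + roi_qp_offset)         # high-clarity ROI block
        else:
            qps.append(base_qp + background_qp_offset)  # coarse background block
    return qps

# Toy usage: a 4 x 4 grid of 16 x 16 blocks with one ROI in the top-left corner.
blocks = [(x * 16, y * 16, 16, 16) for y in range(4) for x in range(4)]
print(assign_block_qps(blocks, rois=[(0, 0, 24, 24)]))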