PSNR of reconstructed video by up to 0.58 dB [8]. Tu et al. modeled more accurate
rate and distortion functions [9]. The Joint Scalable Video Model (JSVM), the
reference software of the newest scalable video coding specification H.264/SVC,
also adopts a JVT-G012-like rate control scheme for its base layer [10]. Yin et al.
proposed an optimum bit allocation scheme to improve the rate control accuracy [11],
though its complexity factor, determined simply by the frame being encoded and the
one preceding it, fails to represent the frame complexities over a GOP. It also gives
little consideration to the differing importance of each P frame in a GOP. Many MBs
in the frame following a scene change may need to be encoded in intra mode and
therefore need more bits; otherwise, picture quality may degrade seriously.
In this paper, we first define a reasonable factor to describe frame complexity and
importance (CI). Then, according to the CI of each frame, an adaptive strategy for
allocating target bits and buffer among different frames is presented to improve the
quality of frames, especially under high motion or scene changes. The paper is
organized as follows. Section 2 briefly introduces preliminary knowledge for the later
sections. In Section 3, a CI-based rate-control method is proposed. Experimental
results demonstrating the effectiveness of the proposed scheme are provided in
Section 4. Section 5 concludes the paper.
2 Analysis of Frame Layer Rate Control in JVT-G012
In JVT-G012, the QPs of the I frame and the first P frame in a group-of-pictures (GOP)
are calculated based on the available channel bandwidth and the GOP length. The QPs
of all the remaining forward predicted pictures (P frames) are calculated based on a
target bit budget for each frame and the RDO process for the current frame. The QPs of
all bi-directional predicted pictures (B frames) are obtained by linear interpolation
of the QPs of the P frames. It is therefore quite important to estimate the target bits
for the current P frame accurately. In this section, we review the method used for
estimating the target bits in JVT-G012 and analyse its limitations.
A fluid traffic model based on the linear tracking theory is employed to estimate
target bits for the current P frame [5]. For simplicity, assume a GOP is encoded with
an IPPP prediction structure. Let $N$ denote the total number of frames in a GOP,
$n_j$ the $j$-th frame in the GOP, $u(n_j)$ the available channel bandwidth,
$T_r(n_j)$ the number of remaining bits before encoding the current frame,
$B_c(n_j)$ the occupancy of the virtual buffer after coding the current frame, and
$A(n_j)$ the actual number of bits generated by encoding a frame. To estimate target
bits for the current P frame, the fluid traffic model is used to update $T_r$ frame by
frame as follows:
$$T_r(n_j) = T_r(n_{j-1}) + \frac{u(n_j) - u(n_{j-1})}{F_r}\,(N - j) - A(n_{j-1}), \tag{1}$$
where $T_r(n_{j-1})$ is the number of remaining bits after encoding the previous frame.
Meanwhile, the target buffer level $Tbl$ for each frame is updated frame by frame as
follows:
$$\mathit{Deltp} = \frac{Tbl(n_2) - B_s/8}{N_p - 1}, \qquad Tbl(n_j) = Tbl(n_{j-1}) - \mathit{Deltp} - \frac{u(n_j)}{F_r}. \tag{2}$$
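As a minimal sketch of the remaining-bit update in Eq. (1), the Python function below evaluates T_r(n_j) = T_r(n_{j-1}) + (u(n_j) - u(n_{j-1})) / F_r * (N - j) - A(n_{j-1}). The function and parameter names are hypothetical, F_r is assumed here to be the frame rate, and the numbers in the usage lines are made-up illustrative values, not results from the paper.

```python
def update_remaining_bits(t_r_prev, u_curr, u_prev, frame_rate, gop_size, j, a_prev):
    """Sketch of Eq. (1): bits left for the GOP before coding the j-th frame.

    t_r_prev   -- T_r(n_{j-1}), remaining bits before coding the previous frame
    u_curr     -- u(n_j), channel bandwidth at the current frame (bits/s)
    u_prev     -- u(n_{j-1}), channel bandwidth at the previous frame (bits/s)
    frame_rate -- F_r, assumed to be the frame rate (frames/s)
    gop_size   -- N, total number of frames in the GOP
    j          -- index of the current frame within the GOP
    a_prev     -- A(n_{j-1}), bits actually generated by the previous frame
    """
    # A bandwidth change is spread over the (N - j) frames still to be coded.
    bandwidth_correction = (u_curr - u_prev) / frame_rate * (gop_size - j)
    # Subtract what the previous frame actually consumed.
    return t_r_prev + bandwidth_correction - a_prev


# Constant bandwidth: only the bits spent on the previous frame are removed.
print(update_remaining_bits(1_000_000, 512_000, 512_000, 30, 30, 2, 60_000))  # 940000.0

# Bandwidth rises by 120 kbit/s with 20 frames left: the budget grows by 80000 bits.
print(update_remaining_bits(500_000, 600_000, 480_000, 30, 30, 10, 20_000))  # 560000.0
```

With constant bandwidth the correction term vanishes, so the budget simply shrinks by the bits already spent; the correction matters only when the channel rate changes within a GOP.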
Then linear tracking theory is employed to determine the target bits allocated to the
$j$-th frame as follows:
 