Scalable H.264 Wireless Video Transmission over MIMO-OFDM Channels - Advanced Concepts for Intelligent Vision Systems - page 40


reconstruction of frame $n$. In case of losing the FGS layers of the reference pictures, only the base layers of the frames $u_n$ and $v_n$ are used as the reference for frame $n$. As discussed above, in SVC the decoding of all the frames in a GOP is done from the lowest to the highest temporal level. Similar to the Temporal-SNR method, we will use the “true” reference frames for distortion estimation and hence, the loss of the base layer of either or both of the reference frames will result in the concealment of frame $n$. The $s$th moment of the $i$th pixel of frame $n$ when at least the base layer is received correctly is calculated as:

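$$
\begin{aligned}
E\{ f_n(u_n, v_n)^s \} ={}& P_{uvB}\, P_{n}^{E1}\, f_n^{B}\big(u_n^{B}, v_n^{B}\big)^s \\
&+ P_{uvB}\, (1 - P_{n}^{E1})\, f_n^{(B,E1)}\big(u_n^{B}, v_n^{B}\big)^s \\
&+ P_{uvB,E1}\, P_{n}^{E1}\, f_n^{B}\big(u_n^{(B,E1)}, v_n^{(B,E1)}\big)^s \\
&+ P_{uvB,E1}\, (1 - P_{n}^{E1})\, f_n^{(B,E1)}\big(u_n^{(B,E1)}, v_n^{(B,E1)}\big)^s
\end{aligned}
\tag{10}
$$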

where $P_{uvB} = (1 - P_{u_n}^{B})(1 - P_{v_n}^{B})\, P_{u_n}^{E1}\, P_{v_n}^{E1}$ is the probability of correctly receiving the base layers and not receiving the FGS layers of the frames $u_n$ and $v_n$. Similarly, $P_{uvB,E1} = (1 - P_{u_n}^{B})(1 - P_{v_n}^{B})(1 - P_{u_n}^{E1})(1 - P_{v_n}^{E1})$ is the probability of correctly receiving the base layers and the FGS layers of the frames $u_n$ and $v_n$. In case the base layer of frame $n$ is lost, the complete frame has to be concealed. To get the per-pixel distortion after error concealment, we use Eq. (9).
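As a rough illustration, Eq. (10) can be read as a weighted sum of four right-hand-side terms, one for each combination of layers available for frame n and for its two references. The sketch below assumes that every `p_*` argument is a loss probability, as in the definitions above. The function names and the dictionary keys `"B"` and `"BE1"` are hypothetical, chosen here only for illustration.

```python
def ref_probabilities(p_u_b, p_v_b, p_u_e1, p_v_e1):
    """Return (P_uvB, P_uvB,E1) from the loss probabilities of the references.

    p_u_b, p_v_b   : probability of losing the base layer of u_n, v_n
    p_u_e1, p_v_e1 : probability of losing the FGS layer of u_n, v_n
    """
    base_ok = (1.0 - p_u_b) * (1.0 - p_v_b)
    # Both base layers received, both FGS layers lost.
    p_uv_b = base_ok * p_u_e1 * p_v_e1
    # Both base layers and both FGS layers received.
    p_uv_be1 = base_ok * (1.0 - p_u_e1) * (1.0 - p_v_e1)
    return p_uv_b, p_uv_be1


def pixel_moment(p_uv_b, p_uv_be1, p_n_e1, terms):
    """Weighted sum of Eq. (10).

    terms maps (layers of frame n, layers of the references) to the value of
    the corresponding right-hand-side term; "B" means base layer only and
    "BE1" means base layer plus FGS layer (hypothetical key names).
    """
    return (
        p_uv_b * p_n_e1 * terms[("B", "B")]
        + p_uv_b * (1.0 - p_n_e1) * terms[("BE1", "B")]
        + p_uv_be1 * p_n_e1 * terms[("B", "BE1")]
        + p_uv_be1 * (1.0 - p_n_e1) * terms[("BE1", "BE1")]
    )
```

For example, `ref_probabilities(0.0, 0.05, 0.1, 0.1)` gives the two weights for references with base-layer loss rates of 0% and 5% and an FGS loss rate of 10% each; the result can then be passed to `pixel_moment` together with the loss probability of the FGS layer of frame n.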

The performance of the two SDDE methods is evaluated by comparing their estimates with the actual decoder distortion averaged over 200 channel realizations. Different video sequences encoded at 30 fps, with a GOP size of eight frames and six layers, are used in packet-based video transmission simulations. Each of these layers is considered to be affected by a different loss rate, $P = \{P_{TL0}, P_{TL1}, P_{TL2}, P_{TL3}, P_{E1}\}$, where $P_{TLx}$ is the probability of losing the base layer of a frame that belongs to TLx and $P_{E1}$ is the probability of losing FGS1 of a frame. For performance evaluation, the packet loss rates considered are $P1 = \{0\%, 0\%, 5\%, 5\%, 10\%\}$ and $P2 = \{0\%, 10\%, 20\%, 30\%, 50\%\}$. In Table 1, the average Peak Signal to Noise Ratio (PSNR) performance is presented for the “Foreman”, “Akiyo” and “Carphone” sequences. As can be observed, both the Temporal-SNR and the SNR-Temporal methods result in good average PSNR estimates (within roughly 1 dB of the actual values) and hence they are used to solve the optimization problem of section 4.
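The averaging over channel realizations can be pictured with a small loss-event simulation. This is only a sketch under stated assumptions: losses are drawn as independent Bernoulli events per layer, and the mapping of the eight frames of a GOP to temporal levels follows a dyadic hierarchy (frame 8 in TL0, frame 4 in TL1, frames 2 and 6 in TL2, odd frames in TL3). Neither assumption is stated in the text, and all names are hypothetical.

```python
import random

# Loss-rate sets from the text: base layer per temporal level, plus FGS1.
P1 = {"TL0": 0.00, "TL1": 0.00, "TL2": 0.05, "TL3": 0.05, "E1": 0.10}
P2 = {"TL0": 0.00, "TL1": 0.10, "TL2": 0.20, "TL3": 0.30, "E1": 0.50}

# Assumed dyadic temporal-level assignment for a GOP of eight frames.
TEMPORAL_LEVEL = {8: "TL0", 4: "TL1", 2: "TL2", 6: "TL2",
                  1: "TL3", 3: "TL3", 5: "TL3", 7: "TL3"}


def draw_gop_losses(loss_rates, rng):
    """Return {frame: (base_lost, fgs_lost)} for one channel realization."""
    return {
        frame: (rng.random() < loss_rates[level],
                rng.random() < loss_rates["E1"])
        for frame, level in TEMPORAL_LEVEL.items()
    }


def empirical_base_loss(loss_rates, realizations=200, seed=0):
    """Fraction of realizations in which each frame loses its base layer."""
    rng = random.Random(seed)
    counts = {frame: 0 for frame in TEMPORAL_LEVEL}
    for _ in range(realizations):
        losses = draw_gop_losses(loss_rates, rng)
        for frame, (base_lost, _fgs_lost) in losses.items():
            counts[frame] += base_lost
    return {frame: count / realizations for frame, count in counts.items()}
```

In a full simulation each realization would be decoded and its distortion measured; here only the loss pattern is drawn, which is enough to see how the per-layer rates in P1 and P2 translate into frame-level loss events.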

Table 2. Average PSNR comparison for the proposed distortion estimation algorithms

                          Foreman     Akiyo       Carphone
                          363 kbps    268 kbps    612 kbps
Actual P1 (dB)            36.40       45.91       40.85
Temporal-SNR SDDE (dB)    35.48       45.84       40.12
SNR-Temporal SDDE (dB)    36.00       45.43       40.35
Actual P2 (dB)            30.82       41.46       35.32
Temporal-SNR SDDE (dB)    29.80       41.20       35.10
SNR-Temporal SDDE (dB)    30.22       40.86       35.28
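To make the comparison concrete, the gap between each estimate and the actual PSNR can be computed directly from the table values. The sketch below only transcribes the numbers above; the dictionary layout and function names are hypothetical.

```python
# PSNR values (dB) transcribed from the table above.
ACTUAL = {
    "P1": {"Foreman": 36.40, "Akiyo": 45.91, "Carphone": 40.85},
    "P2": {"Foreman": 30.82, "Akiyo": 41.46, "Carphone": 35.32},
}
ESTIMATED = {
    "Temporal-SNR": {
        "P1": {"Foreman": 35.48, "Akiyo": 45.84, "Carphone": 40.12},
        "P2": {"Foreman": 29.80, "Akiyo": 41.20, "Carphone": 35.10},
    },
    "SNR-Temporal": {
        "P1": {"Foreman": 36.00, "Akiyo": 45.43, "Carphone": 40.35},
        "P2": {"Foreman": 30.22, "Akiyo": 40.86, "Carphone": 35.28},
    },
}


def abs_errors(method):
    """Absolute PSNR gap (dB) per (loss set, sequence) for one method."""
    return {
        (loss_set, seq): abs(ACTUAL[loss_set][seq] - psnr)
        for loss_set, per_seq in ESTIMATED[method].items()
        for seq, psnr in per_seq.items()
    }


def summarize(method):
    """Return (mean absolute error, worst case key, worst case error)."""
    errors = abs_errors(method)
    worst = max(errors, key=errors.get)
    return sum(errors.values()) / len(errors), worst, errors[worst]


for method in ESTIMATED:
    mean_err, worst_case, worst_err = summarize(method)
    print(f"{method}: mean |error| = {mean_err:.2f} dB, "
          f"worst = {worst_err:.2f} dB at {worst_case}")
```

With these values the largest gap is about 1.02 dB (Temporal-SNR, Foreman, P2), and the SNR-Temporal estimates stay within about 0.6 dB of the actual PSNR.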
