If we use the minimum variance code (Table 3.9), the lengths of the codewords are $\{2, 2, 2, 3, 3\}$. Substituting these values into the left-hand side of Equation (2), we get

$$2^{-2} + 2^{-2} + 2^{-2} + 2^{-3} + 2^{-3} = 1$$

which again satisfies the inequality.
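As a quick sanity check, this sum can be computed directly. The snippet below is a minimal sketch (the variable names are ours, not from the text):

```python
lengths = [2, 2, 2, 3, 3]               # codeword lengths of the minimum variance code
kraft_sum = sum(2 ** -l for l in lengths)
print(kraft_sum)                        # 1.0, so sum 2^{-l_i} <= 1 holds with equality
```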
The second part of this result, due to Kraft, states that if we have a sequence of positive integers $\{l_i\}_{i=1}^{K}$ that satisfies (2), then there exists a uniquely decodable code whose codeword lengths are given by the sequence $\{l_i\}_{i=1}^{K}$.
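Kraft's result admits a standard constructive proof, which is easy to sketch in code: sort the lengths, and take the $i$-th codeword to be the first $l_i$ bits of the binary expansion of $\sum_{j<i} 2^{-l_j}$. The sketch below follows this canonical construction (the function name and implementation details are ours, not the book's):

```python
from fractions import Fraction

def kraft_code(lengths):
    """Build a binary prefix code from lengths satisfying the Kraft inequality.

    Canonical construction: after sorting, codeword i is the first l_i bits
    of the binary expansion of sum_{j<i} 2^{-l_j}.
    """
    assert sum(Fraction(1, 2 ** l) for l in lengths) <= 1, "Kraft inequality violated"
    codes, acc = [], Fraction(0)
    for l in sorted(lengths):
        codes.append(format(int(acc * 2 ** l), f"0{l}b"))
        acc += Fraction(1, 2 ** l)
    return codes

print(kraft_code([2, 2, 2, 3, 3]))  # ['00', '01', '10', '110', '111']
```

The resulting code is prefix-free, hence uniquely decodable, which is what the theorem promises.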
Using this result, we will now show the following:
1. The average codeword length $\bar{l}$ of an optimal code for a source $S$ is greater than or equal to $H(S)$.
2. The average codeword length $\bar{l}$ of an optimal code for a source $S$ is strictly less than $H(S) + 1$.
For a source $S$ with alphabet $\mathcal{A} = \{a_1, a_2, \ldots, a_K\}$, and probability model $\{P(a_1), P(a_2), \ldots, P(a_K)\}$, the average codeword length is given by

$$\bar{l} = \sum_{i=1}^{K} P(a_i)\, l_i$$
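Numerically, $\bar{l}$ is just a probability-weighted sum of the lengths. The snippet below pairs the lengths $\{2, 2, 2, 3, 3\}$ from above with a hypothetical five-letter probability model (the probabilities are ours, chosen for illustration, not from the text):

```python
probs = [0.4, 0.2, 0.2, 0.1, 0.1]   # hypothetical probability model
lengths = [2, 2, 2, 3, 3]           # minimum variance code lengths from above
avg_len = sum(p * l for p, l in zip(probs, lengths))
print(avg_len)                      # 2.2 bits per symbol
```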
Therefore, we can write the difference between the entropy of the source $H(S)$ and the average length as

$$
\begin{aligned}
H(S) - \bar{l} &= -\sum_{i=1}^{K} P(a_i) \log_2 P(a_i) - \sum_{i=1}^{K} P(a_i)\, l_i \\
&= \sum_{i=1}^{K} P(a_i) \left[ \log_2 \frac{1}{P(a_i)} - l_i \right] \\
&= \sum_{i=1}^{K} P(a_i) \left[ \log_2 \frac{1}{P(a_i)} - \log_2 2^{l_i} \right] \\
&= \sum_{i=1}^{K} P(a_i) \log_2 \left[ \frac{2^{-l_i}}{P(a_i)} \right] \\
&\leq \log_2 \left[ \sum_{i=1}^{K} 2^{-l_i} \right]
\end{aligned}
$$
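The final bound is easy to check numerically for a concrete source. The sketch below reuses the hypothetical probability model from above and verifies that $H(S) - \bar{l}$ is at most $\log_2 \sum_i 2^{-l_i}$:

```python
from math import log2

probs = [0.4, 0.2, 0.2, 0.1, 0.1]   # same hypothetical model as above
lengths = [2, 2, 2, 3, 3]

entropy = -sum(p * log2(p) for p in probs)            # H(S)
avg_len = sum(p * l for p, l in zip(probs, lengths))  # average length
bound = log2(sum(2 ** -l for l in lengths))           # log2 of the Kraft sum

assert entropy - avg_len <= bound
print(entropy - avg_len, bound)     # approx -0.078 <= 0.0
```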
The last inequality is obtained using Jensen's inequality, which states that if $f(x)$ is a concave (convex cap, convex $\cap$) function, then $E[f(X)] \leq f(E[X])$. The log function is a concave function.
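A two-point example makes the direction of Jensen's inequality concrete for the concave $\log_2$ (the numbers are ours, chosen for illustration):

```python
from math import log2

# E[log2 X] <= log2 E[X] for a two-point distribution
xs, ps = [1.0, 4.0], [0.5, 0.5]
lhs = sum(p * log2(x) for p, x in zip(ps, xs))   # E[log2 X] = 1.0
rhs = log2(sum(p * x for p, x in zip(ps, xs)))   # log2 E[X] = log2(2.5) ~ 1.32
assert lhs <= rhs
```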