As the code is an optimal code, $\sum_{i=1}^{K} 2^{-l_i} \le 1$; therefore,

$$H(S) - \bar{l} \le 0 \qquad (3)$$

We will prove the upper bound by showing that there exists a uniquely decodable code with average codeword length less than $H(S) + 1$. Therefore, if we have an optimal code, this code must have an average length that is less than $H(S) + 1$.
Given a source, alphabet, and probability model as before, define
$$l_i = \left\lceil \log_2 \frac{1}{P(a_i)} \right\rceil$$
where $\lceil x \rceil$ is the smallest integer greater than or equal to $x$. For example, $\lceil 3.3 \rceil = 4$ and $\lceil 5 \rceil = 5$. Thus,

$$\lceil x \rceil = x + \epsilon, \qquad \text{where } 0 \le \epsilon < 1$$
Therefore,
$$\log_2 \frac{1}{P(a_i)} \le l_i < \log_2 \frac{1}{P(a_i)} + 1 \qquad (4)$$
From the left inequality of (4), we can see that
$$2^{-l_i} \le P(a_i)$$
Accordingly,
$$\sum_{i=1}^{K} 2^{-l_i} \le \sum_{i=1}^{K} P(a_i) = 1$$
and by the Kraft-McMillan inequality, there exists a uniquely decodable code with codeword
lengths $\{l_i\}$. The average length of this code can be upper-bounded by using the right inequality of (4):
$$\bar{l} = \sum_{i=1}^{K} P(a_i)\, l_i < \sum_{i=1}^{K} P(a_i) \left[ \log_2 \frac{1}{P(a_i)} + 1 \right]$$
or
$$\bar{l} < H(S) + 1$$

We can see from the way the upper bound was derived that this is a rather loose upper bound. In fact, it can be shown that if $p_{max}$ is the largest probability in the probability model, then for $p_{max} \ge 0.5$, the upper bound for the Huffman code is $H(S) + p_{max}$, while for $p_{max} < 0.5$, the upper bound is $H(S) + p_{max} + 0.086$. Obviously, this is a much tighter bound than the one we derived above. The derivation of this bound takes some time (see [21] for details).
3.2.6 Extended Huffman Codes
In applications where the alphabet size is large, $p_{max}$ is generally quite small, and the amount of deviation from the entropy, especially in terms of a percentage of the rate, is quite small. However, in cases where the alphabet is small and the probability of occurrence of the different letters is skewed, the value of $p_{max}$ can be quite large, and the Huffman code can become rather inefficient when compared to the entropy.
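To make this concrete, the sketch below builds Huffman codeword lengths for a hypothetical three-letter source with skewed probabilities (0.95, 0.03, 0.02; the values are invented for illustration, not taken from the text) using the standard merge-the-two-least-probable procedure, and compares the resulting rate to the entropy.

```python
import heapq
import math

def huffman_lengths(probs):
    """Return Huffman codeword lengths for a list of probabilities,
    built by repeatedly merging the two least probable subtrees."""
    # Each heap entry: (probability, tie-breaker, symbol indices in subtree)
    heap = [(p, i, [i]) for i, p in enumerate(probs)]
    heapq.heapify(heap)
    lengths = [0] * len(probs)
    while len(heap) > 1:
        p1, _, s1 = heapq.heappop(heap)
        p2, t, s2 = heapq.heappop(heap)
        # Every symbol in the two merged subtrees gets one more bit.
        for i in s1 + s2:
            lengths[i] += 1
        heapq.heappush(heap, (p1 + p2, t, s1 + s2))
    return lengths

# Hypothetical skewed three-letter alphabet, for illustration only.
P = [0.95, 0.03, 0.02]

lengths = huffman_lengths(P)
entropy = -sum(p * math.log2(p) for p in P)
rate = sum(p * l for p, l in zip(P, lengths))

print("Huffman lengths:", lengths)                       # [1, 2, 2]
print("entropy H(S)   : %.3f bits/symbol" % entropy)
print("Huffman rate   : %.3f bits/symbol" % rate)
print("redundancy     : %.3f bits/symbol" % (rate - entropy))
```

Here the Huffman code spends 1.05 bits/symbol while the entropy is only about 0.335 bits/symbol. The redundancy of roughly 0.715 bits is within the $H(S) + p_{max}$ bound quoted above ($p_{max} = 0.95 \ge 0.5$), yet it amounts to more than twice the entropy itself, which is the kind of inefficiency that motivates the extended Huffman codes discussed next.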
 