Databases Reference
In-Depth Information
The most widely used algorithm for constructing length-limited Huffman codes is the
package-merge algorithm due to Larmore and Hirchberg [ 18 ]. The package-merge algorithm
is more general than what we describe here, our only interest being its application to designing
length-limited Huffman codes. Our description is based on the work of Turpin and Moffat
[ 19 ]. We will use a couple of facts from our prior discussions to design length-limited Huffman
codes. First, as we have seen in the previous section, all we need to obtain a Huffman code
is the length of the codewords. Second, as we showed in Chapter 2, for an alphabet of size
m the lengths of the codewords
{
l 1 ,
l 2 , ··· ,
l m }
in any variable length code have to satisfy the
Kraft-McMillan inequality
m
2 l i
1
i
=
1
and, given a set of l i that satisfy the Kraft-McMillan inequality, we can always find a uniquely
decodable code with codewords of length l i .
Suppose our alphabet consists of
{
a 1 ,
a 2 ,...,
a m }
the letters
with probabilities
{
p 1 ,
p 2 ,...,
p m }
. We will assume that the letters have been sorted based on their proba-
bilities such that p 1
p m . Our design process will involve incrementing the
lengths l i until the Kraft-McMillan inequality is satisfied while at the same time keeping the
average length l as small as we can. We start with assigning zero bits for all lengths. Clearly,
this will result in the lowest possible length of zero. However, it certainly does not give us a
uniquely decodable code. Not that we need to justify the latter point mathematically, but if we
did, we could compute the Kraft-McMillan sum as
p 2 ···
m
m
2 l i
2 0
=
=
m
i
=
1
i
=
1
which is clearly greater than one. We can decrement this sum by 0.5 by incrementing the
codeword for one of the letters to one. We want to pick the codeword that will have the
minimal impact on the average codeword length. As the letters are sorted in nondecreasing
probability order and the average codeword length is given by
m
l
=
p i l i
i
=
1
we will increase l 1 to one. This will give us an average codeword length of p 1 and a Kraft-
McMillan sum of m
5. Actually, we could have picked any of the letters as we will need to
increment each length to at least one if we are to have any hope of reducing the Kraft-McMillan
sum to one. However, once we have incremented each codeword length to one how do we
decide which codeword lengths to increment further? The package-merge algorithm is an
iterative algorithm that solves this problem by generating a list of choices that can be sorted in
order of increasing cost. Each iteration consists of two steps operating on the list, a packaging
step and a merge step. In the packaging step the list from the previous step is partitioned,
or packaged, into groups of two neighboring items. The cost of a package is the sum of the
costs of the items in the package where the cost of a letter is the probability of that letter. If
the number of items is odd the item with the highest cost is discarded. In the merge step the
0
.
Search WWH ::




Custom Search