Example 3.2.5:
For the source described in the previous example, we will generate a codeword for every two
symbols, instead of generating a codeword for every symbol. If we look at the source sequence
two at a time, the number of possible symbol pairs, or size of the extended alphabet, is
3^2 = 9. The extended alphabet, probability model, and Huffman code for this example are
shown in Table 3.13.
TABLE 3.13  The extended alphabet and corresponding Huffman code

Letter    Probability    Code
a1 a1     0.64           0
a1 a2     0.016          10101
a1 a3     0.144          11
a2 a1     0.016          101000
a2 a2     0.0004         10100101
a2 a3     0.0036         1010011
a3 a1     0.1440         100
a3 a2     0.0036         10100100
a3 a3     0.0324         1011
The average codeword length for this extended code is 1.7228 bits/symbol. However,
each symbol in the extended alphabet corresponds to two symbols from the original alphabet.
Therefore, in terms of the original alphabet, the average codeword length is
1.7228/2 = 0.8614 bits/symbol. The redundancy is about 0.045 bits/symbol, which is only
about 5.5% of the entropy.
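These numbers can be checked directly from Table 3.13. The short sketch below (not from the text) computes the average codeword length per pair and per original symbol from the tabulated probabilities and codewords:

```python
# Pair probabilities and Huffman codewords, copied from Table 3.13.
table = {
    "a1a1": (0.64,   "0"),
    "a1a2": (0.016,  "10101"),
    "a1a3": (0.144,  "11"),
    "a2a1": (0.016,  "101000"),
    "a2a2": (0.0004, "10100101"),
    "a2a3": (0.0036, "1010011"),
    "a3a1": (0.1440, "100"),
    "a3a2": (0.0036, "10100100"),
    "a3a3": (0.0324, "1011"),
}

# Average codeword length per extended symbol (i.e., per symbol pair).
avg_per_pair = sum(p * len(code) for p, code in table.values())

# Each extended symbol stands for two original symbols.
avg_per_symbol = avg_per_pair / 2

print(avg_per_pair)    # ~1.7228 bits per pair
print(avg_per_symbol)  # ~0.8614 bits per original symbol
```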
We see that by coding blocks of symbols together, we can reduce the redundancy of
Huffman codes. In the previous example, two symbols were blocked together to obtain a rate
reasonably close to the entropy. Blocking two symbols together means the alphabet size goes
from m to m^2, where m was the size of the initial alphabet. In this case, m was three, so the size
of the extended alphabet was nine. This size is not an excessive burden for most applications.
However, if the probabilities of the symbols were more unbalanced, then it would require
blocking many more symbols together before the redundancy was lowered to acceptable levels.
As we block more and more symbols together, the size of the alphabet grows exponentially, and
the Huffman coding scheme becomes impractical. Under these conditions, we need to look at
techniques other than Huffman coding. One approach that is very useful in these conditions
is arithmetic coding. We will discuss this technique in some detail in the next chapter.
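To make the blocking step concrete, here is a small sketch (not from the text) that builds a Huffman code over the extended alphabet using Python's heapq. The single-letter probabilities 0.8, 0.02, and 0.18 are the marginals implied by Table 3.13; the helper returns only codeword lengths, which is all that is needed to compute the rate:

```python
import heapq
import itertools

def huffman_lengths(probs):
    """Codeword lengths of a Huffman code for `probs` (dict: symbol -> probability)."""
    counter = itertools.count()  # tie-breaker so heap tuples never compare the lists
    heap = [(p, next(counter), [s]) for s, p in probs.items()]
    heapq.heapify(heap)
    lengths = dict.fromkeys(probs, 0)
    while len(heap) > 1:
        p1, _, group1 = heapq.heappop(heap)   # two least probable subtrees
        p2, _, group2 = heapq.heappop(heap)
        for s in group1 + group2:             # merging pushes every symbol in
            lengths[s] += 1                   # both subtrees one level deeper
        heapq.heappush(heap, (p1 + p2, next(counter), group1 + group2))
    return lengths

# Single-letter model: the marginals implied by Table 3.13.
p = {"a1": 0.8, "a2": 0.02, "a3": 0.18}

# Extended alphabet: all 3^2 = 9 ordered pairs, with product probabilities.
pairs = {x + y: p[x] * p[y] for x in p for y in p}

lengths = huffman_lengths(pairs)
avg_per_pair = sum(pairs[s] * lengths[s] for s in pairs)
print(avg_per_pair / 2)   # ~0.8614 bits per original symbol
```

Individual codeword lengths may differ from Table 3.13 because of ties among equal probabilities, but every Huffman code achieves the same optimal average, 1.7228 bits per pair.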
3.2.7 Implementation of Huffman Codes
Huffman encoding is relatively simple to implement using a table lookup. Decoding is another
matter. If speed is the only factor, we can implement the decoder using table lookup as well.
However, for the decoder, the table has to be of size 2^N, where N is the length of the longest
 