FIGURE 3.1 The Huffman encoding procedure applied to the symbols a1 (0.2), a2 (0.4), a3 (0.2), a4 (0.1), and a5 (0.1). The symbol probabilities are listed in parentheses.
A measure of the efficiency of this code is its redundancy, the difference between the average codeword length and the entropy. In this case, the redundancy is 0.078 bits/symbol. The redundancy is zero when the probabilities are negative integer powers of two, that is, of the form 2^-k.
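As a quick sanity check on these numbers, the following sketch recomputes the entropy, average length, and redundancy for the alphabet of Example 3.2.1. The codeword lengths used here (1 bit for a2, 2 for a1, 3 for a3, and 4 each for a4 and a5) are an assumption consistent with the procedure of Figure 3.1 and with the stated redundancy of 0.078 bits/symbol.

```python
from math import log2

# Alphabet of Example 3.2.1: symbol -> probability.
probs = {"a1": 0.2, "a2": 0.4, "a3": 0.2, "a4": 0.1, "a5": 0.1}
# Assumed codeword lengths for the Huffman code of Figure 3.1.
lengths = {"a1": 2, "a2": 1, "a3": 3, "a4": 4, "a5": 4}

entropy = -sum(p * log2(p) for p in probs.values())  # about 2.122 bits/symbol
avg_len = sum(probs[s] * lengths[s] for s in probs)  # 2.2 bits/symbol
print(f"redundancy = {avg_len - entropy:.3f} bits/symbol")  # prints 0.078
```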
An alternative way of building a Huffman code is to use the fact that the Huffman code, by
virtue of being a prefix code, can be represented as a binary tree in which the external nodes
or leaves correspond to the symbols. The Huffman code for any symbol can be obtained by
traversing the tree from the root node to the leaf corresponding to the symbol, adding a 0 to
the codeword every time the traversal takes us over an upper branch and a 1 every time the
traversal takes us over a lower branch.
We build the binary tree starting at the leaf nodes. We know that the codewords for the
two symbols with smallest probabilities are identical except for the last bit. This means that
the traversal from the root to the leaves corresponding to these two symbols must be the same
except for the last step. This in turn means that the leaves corresponding to the two symbols
with the lowest probabilities are offspring of the same node. Once we have connected the
leaves corresponding to the symbols with the lowest probabilities to a single node, we treat
this node as a symbol of a reduced alphabet. The probability of this symbol is the sum of
the probabilities of its offspring. We can now sort the nodes corresponding to the reduced
alphabet and apply the same rule to generate a parent node for the nodes corresponding to the
two symbols in the reduced alphabet with lowest probabilities. Continuing in this manner, we
end up with a single node, which is the root node. To obtain the code for each symbol, we
traverse the tree from the root to each leaf node, assigning a 0 to the upper branch and a 1 to
the lower branch. This procedure, as applied to the alphabet of Example 3.2.1, is shown in Figure 3.2. Notice the similarity between Figures 3.1 and 3.2. This is not surprising, as they are a result of viewing the same procedure in two different ways.
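To make the bottom-up construction concrete, here is a minimal Python sketch of the same procedure using a priority queue. The symbol names and probabilities are those of Example 3.2.1; note that ties between equal probabilities can be broken in more than one way, so the code it prints may differ from the one in Figures 3.1 and 3.2 while having the same average length.

```python
import heapq
from itertools import count

def huffman_code(probs):
    """Build a Huffman code by repeatedly merging the two least probable
    nodes, as described above. Returns a dict mapping symbol -> codeword."""
    tiebreak = count()  # keeps heap comparisons well-defined on equal probabilities
    # Each heap entry is (probability, tiebreak, tree), where a tree is
    # either a symbol (a leaf) or a (left, right) pair (an internal node).
    heap = [(p, next(tiebreak), sym) for sym, p in probs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        p1, _, t1 = heapq.heappop(heap)  # lowest probability
        p2, _, t2 = heapq.heappop(heap)  # second lowest
        # The new node stands for a symbol of the reduced alphabet; its
        # probability is the sum of the probabilities of its offspring.
        heapq.heappush(heap, (p1 + p2, next(tiebreak), (t1, t2)))
    # Traverse from the root: 0 for the upper branch, 1 for the lower.
    codes = {}
    def walk(tree, prefix):
        if isinstance(tree, tuple):
            upper, lower = tree
            walk(upper, prefix + "0")
            walk(lower, prefix + "1")
        else:
            codes[tree] = prefix
    _, _, root = heap[0]
    walk(root, "")
    return codes

probs = {"a1": 0.2, "a2": 0.4, "a3": 0.2, "a4": 0.1, "a5": 0.1}
print(huffman_code(probs))
```

Popping the two least probable nodes and pushing their parent back onto the heap is exactly the reduced-alphabet step described above; the final entry left on the heap is the root, from which the codewords are read off by the 0/1 traversal.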
3.2.1 Minimum Variance Huffman Codes
By performing the sorting procedure in a slightly different manner, we could have found a
different Huffman code. In the first re-sort, we could place a4 higher in the list, as shown in Table 3.6.