Java Reference
In-Depth Information
18 bits, more space than required by the corresponding fixed-length coding. The
problem is that “MUCK” is composed of letters that are not expected to occur
often. If the message does not match the expected frequencies of the letters, than
the length of the encoding will not be as expected either.
5.6.3
Search in Human Trees
When we decode a character using the Huffman coding tree, we follow a path
through the tree dictated by the bits in the code string. Each '0' bit indicates a left
branch while each '1' bit indicates a right branch. Now look at Figure 5.26 and
consider this structure in terms of searching for a given letter (whose key value is
its Huffman code). We see that all letters with codes beginning with '0' are stored
in the left branch, while all letters with codes beginning with '1' are stored in the
right branch. Contrast this with storing records in a BST. There, all records with
key value less than the root value are stored in the left branch, while all records
with key values greater than the root are stored in the right branch.
If we view all records stored in either of these structures as appearing at some
point on a number line representing the key space, we can see that the splitting
behavior of these two structures is very different. The BST splits the space based
on the key values as they are encountered when going down the tree. But the splits
in the key space are predetermined for the Huffman tree. Search tree structures
whose splitting points in the key space are predetermined are given the special
name trie to distinguish them from the type of search tree (like the BST) whose
splitting points are determined by the data. Tries are discussed in more detail in
Chapter 13.
5.7
Further Reading
See Shaffer and Brown [SB93] for an example of a tree implementation where an
internal node pointer field stores the value of its child instead of a pointer to its
child when the child is a leaf node.
Many techniques exist for maintaining reasonably balanced BSTs in the face of
an unfriendly series of insert and delete operations. One example is the AVL tree of
Adelson-Velskii and Landis, which is discussed by Knuth [Knu98]. The AVL tree
(see Section 13.2) is actually a BST whose insert and delete routines reorganize the
tree structure so as to guarantee that the subtrees rooted by the children of any node
will differ in height by at most one. Another example is the splay tree [ST85], also
discussed in Section 13.2.
See Bentley's Programming Pearl “Thanks, Heaps” [Ben85, Ben88] for a good
discussion on the heap data structure and its uses.
The proof of Section 5.6.1 that the Huffman coding tree has minimum external
path weight is from Knuth [Knu97]. For more information on data compression
Search WWH ::




Custom Search