Java Reference
In-Depth Information
with keys in the upper half of the key range will be stored in the right subtree.
While such a decomposition rule will not necessarily result in a balanced tree (the
tree will be unbalanced if the records are not well distributed within the key range),
at least the shape of the tree will not depend on the order of key insertion. Further-
more, the depth of the tree will be limited by the resolution of the key range; that
is, the depth of the tree can never be greater than the number of bits required to
store a key value. For example, if the keys are integers in the range 0 to 1023, then
the resolution for the key is ten bits. Thus, two keys can be identical only until the
tenth bit. In the worst case, two keys will follow the same path in the tree only until
the tenth branch. As a result, the tree will never be more than ten levels deep. In
contrast, a BST containing n records could be as much as n levels deep.
Splitting based on predetermined subdivisions of the key range is called key
space decomposition. In computer graphics, the technique is known as image
space decomposition, and this term is sometimes used to describe the process for
data structures as well. A data structure based on key space decomposition is called
a trie. Folklore has it that “trie” comes from “retrieval.” Unfortunately, that would
imply that the word is pronounced “tree,” which would lead to confusion with reg-
ular use of the word “tree.” “Trie” is actually pronounced as “try.”
Like the B + -tree, a trie stores data records only in leaf nodes. Internal nodes
serve as placeholders to direct the search process. but since the split points are pre-
determined, internal nodes need not store “traffic-directing” key values. Figure 13.1
illustrates the trie concept. Upper and lower bounds must be imposed on the key
values so that we can compute the middle of the key range. Because the largest
value inserted in this example is 120, a range from 0 to 127 is assumed, as 128 is
the smallest power of two greater than 120. The binary value of the key determines
whether to select the left or right branch at any given point during the search. The
most significant bit determines the branch direction at the root. Figure 13.1 shows
a binary trie, so called because in this example the trie structure is based on the
value of the key interpreted as a binary number, which results in a binary tree.
The Huffman coding tree of Section 5.6 is another example of a binary trie. All
data values in the Huffman tree are at the leaves, and each branch splits the range
of possible letter codes in half. The Huffman codes are actually reconstructed from
the letter positions within the trie.
These are examples of binary tries, but tries can be built with any branching
factor. Normally the branching factor is determined by the alphabet used. For
binary numbers, the alphabet is f0, 1g and a binary trie results. Other alphabets
lead to other branching factors.
One application for tries is to store a dictionary of words. Such a trie will be
referred to as an alphabet trie. For simplicity, our examples will ignore case in
letters. We add a special character ($) to the 26 standard English letters. The $
character is used to represent the end of a string. Thus, the branching factor for
Search WWH ::




Custom Search