Advanced Tree Structures - Data Structures and Algorithm Analysis

Java Reference

In-Depth Information

with keys in the upper half of the key range will be stored in the right subtree.

While such a decomposition rule will not necessarily result in a balanced tree (the

tree will be unbalanced if the records are not well distributed within the key range),

at least the shape of the tree will not depend on the order of key insertion. Further-

more, the depth of the tree will be limited by the resolution of the key range; that

is, the depth of the tree can never be greater than the number of bits required to

store a key value. For example, if the keys are integers in the range 0 to 1023, then

the resolution for the key is ten bits. Thus, two keys can be identical only until the

tenth bit. In the worst case, two keys will follow the same path in the tree only until

the tenth branch. As a result, the tree will never be more than ten levels deep. In

contrast, a BST containing n records could be as much as n levels deep.

Splitting based on predetermined subdivisions of the key range is called key

space decomposition. In computer graphics, the technique is known as image

space decomposition, and this term is sometimes used to describe the process for

data structures as well. A data structure based on key space decomposition is called

a trie. Folklore has it that “trie” comes from “retrieval.” Unfortunately, that would

imply that the word is pronounced “tree,” which would lead to confusion with reg-

ular use of the word “tree.” “Trie” is actually pronounced as “try.”

Like the B + -tree, a trie stores data records only in leaf nodes. Internal nodes

serve as placeholders to direct the search process. but since the split points are pre-

determined, internal nodes need not store “traffic-directing” key values. Figure 13.1

illustrates the trie concept. Upper and lower bounds must be imposed on the key

values so that we can compute the middle of the key range. Because the largest

value inserted in this example is 120, a range from 0 to 127 is assumed, as 128 is

the smallest power of two greater than 120. The binary value of the key determines

whether to select the left or right branch at any given point during the search. The

most significant bit determines the branch direction at the root. Figure 13.1 shows

a binary trie, so called because in this example the trie structure is based on the

value of the key interpreted as a binary number, which results in a binary tree.

The Huffman coding tree of Section 5.6 is another example of a binary trie. All

data values in the Huffman tree are at the leaves, and each branch splits the range

of possible letter codes in half. The Huffman codes are actually reconstructed from

the letter positions within the trie.

These are examples of binary tries, but tries can be built with any branching

factor. Normally the branching factor is determined by the alphabet used. For

binary numbers, the alphabet is f0, 1g and a binary trie results. Other alphabets

lead to other branching factors.

One application for tries is to store a dictionary of words. Such a trie will be

referred to as an alphabet trie. For simplicity, our examples will ignore case in

letters. We add a special character ($) to the 26 standard English letters. The $

character is used to represent the end of a string. Thus, the branching factor for

Search WWH ::

Custom Search

Home