Balanced Search Trees - Data Structures and Abstractions with Java

Java Reference

In-Depth Information

Java Class Library: The Class TreeMap

27.39

The package java.util contains the class TreeMap<K, V> . This class uses a red-black tree to

implement the methods in the interface SortedMap<K, V> in the same package. SortedMap

extends the interface Map<K, V> , which we described in Segment 19.22 of Chapter 19. Recall

that the interface Map is similar to our interface for the ADT dictionary. SortedMap specifies a

sorted dictionary in which the search keys are maintained in ascending order. Because

TreeMap uses a red-black tree, methods such as get , put , remove , and containsKey are each

an O(log n ) operation.

B-Trees

27.40

A multiway search tree of order m —or sometimes an m -way search tree —is a general tree

whose nodes have up to m children each. A node that has k - 1 data items and k children is called a

k -node . An order m multiway search tree can contain k -nodes for values of k ranging from 2 to m .

A binary search tree is a multiway search tree of order 2. You know that not all binary search

trees are balanced; likewise, not all multiway search trees are balanced. However, 2-3 trees and 2-4

trees are balanced multiway search trees of orders 3 and 4, respectively. We maintained the balance

of a 2-3 tree, for example, by insisting that every interior node have two or three children and that

all leaves occur on the same level.

A B-tree of order m is a balanced multiway search tree of order m that has the following

additional properties to maintain its balance:

•

The root has either no children or between 2 and m children.

•

Other interior nodes (nonleaves) have between

m

and m children each.

⁄

2

•

All leaves are on the same level.

2-3 and 2-4 trees satisfy these constraints, and so are examples of B-trees.

27.41

The search trees that you have seen so far maintain their data within the main memory of a

computer. At some point, we probably will save this data in external memory, such as a disk. As

long as we can read the data back into internal memory, we can use any of the previous search trees.

But what happens when your database becomes too large to be retained entirely within internal

memory? Typically, you use a B-tree.

Accessing data in external memory is much slower than accessing data in main memory. When

reading external data, the major cost is locating it on the storage device. Data on a disk, for exam-

ple, is organized sequentially into blocks , whose size depends on the physical characteristics of the

disk. When you read data from a disk, an entire block is read. Locating the block takes much more

time than reading the data. If each block contains the data for at least one node, you can reduce the

access time by placing numerous data items in each node. Although many comparisons per node

could be necessary, their cost is much less than the cost of accessing external data.

Since increasing the number of data items per node decreases the tree's height, you decrease

the number of nodes that you must search and hence the number of disk accesses. A high-order

B-tree fits these requirements. You would choose the order m so that m - 1 data items fit into a

block on the disk.

Note: Although a high-order B-tree is usually counterproductive for an internal database

because the number of comparisons per node increases, it is attractive when it is maintained

in external storage such as a disk.

Data Structures and Abstractions with Java

Search WWH ::

Custom Search

Home