Binary Trees - Data Structures and Algorithm Analysis

Java Reference

In-Depth Information

2 2P+ n 2 (p+d) units of space. If P = D, then the overhead is 3P=(3P+D) = 3=4.

It might seem counter-intuitive that the overhead ratio has gone up while the total

amount of space has gone down. The reason is because we have changed our defini-

tion of “data” to refer only to what is stored in the leaf nodes, so while the overhead

fraction is higher, it is from a total storage requirement that is lower.

There is one serious flaw with this analysis. When using separate implemen-

tations for internal and leaf nodes, there must be a way to distinguish between

the node types. When separate node types are implemented via Java subclasses,

the runtime environment stores information with each object allowing it to deter-

mine, for example, the correct subclass to use when the isLeaf virtual function is

called. Thus, each node requires additional space. Only one bit is truly necessary

to distinguish the two possibilities. In rare applications where space is a critical

resource, implementors can often find a spare bit within the node's value field in

which to store the node type indicator. An alternative is to use a spare bit within

a node pointer to indicate node type. For example, this is often possible when the

compiler requires that structures and objects start on word boundaries, leaving the

last bit of a pointer value always zero. Thus, this bit can be used to store the node-

type flag and is reset to zero before the pointer is dereferenced. Another alternative

when the leaf value field is smaller than a pointer is to replace the pointer to a leaf

with that leaf's value. When space is limited, such techniques can make the differ-

ence between success and failure. In any other situation, such “bit packing” tricks

should be avoided because they are difficult to debug and understand at best, and

are often machine dependent at worst. 2

n

5.3.3

Array Implementation for Complete Binary Trees

The previous section points out that a large fraction of the space in a typical binary

tree node implementation is devoted to structural overhead, not to storing data.

This section presents a simple, compact implementation for complete binary trees.

Recall that complete binary trees have all levels except the bottom filled out com-

pletely, and the bottom level has all of its nodes filled in from left to right. Thus,

a complete binary tree of n nodes has only one possible shape. You might think

that a complete binary tree is such an unusual occurrence that there is no reason

to develop a special implementation for it. However, the complete binary tree has

practical uses, the most important being the heap data structure discussed in Sec-

tion 5.5. Heaps are often used to implement priority queues (Section 5.5) and for

external sorting algorithms (Section 8.5.2).

2 In the early to mid 1980s, I worked on a Geographic Information System that stored spatial data

in quadtrees (see Section 13.3). At the time space was a critical resource, so we used a bit-packing

approach where we stored the nodetype flag as the last bit in the parent node's pointer. This worked

perfectly on various 32-bit workstations. Unfortunately, in those days IBM PC-compatibles used

16-bit pointers. We never did figure out how to port our code to the 16-bit machine.

Search WWH ::

Custom Search

Home