Java Reference
In-Depth Information
5.3.2 Space Requirements
This section presents techniques for calculating the amount of overhead required by
a binary tree implementation. Recall that overhead is the amount of space necessary
to maintain the data structure. In other words, it is any space not used to store
data records. The amount of overhead depends on several factors including which
nodes store data values (all nodes, or just the leaves), whether the leaves store child
pointers, and whether the tree is a full binary tree.
In a simple pointer-based implementation for the binary tree such as that of
Figure 5.7, every node has two pointers to its children (even when the children are
null ). This implementation requires total space amounting to n(2P + D) for a
tree of n nodes. Here, P stands for the amount of space required by a pointer, and
D stands for the amount of space required by a data value. The total overhead space
will be 2Pn for the entire tree. Thus, the overhead fraction will be 2P=(2P + D).
The actual value for this expression depends on the relative size of pointers versus
data fields. If we arbitrarily assume that P = D, then a full tree has about two
thirds of its total space taken up in overhead. Worse yet, Theorem 5.2 tells us that
about half of the pointers are “wasted” null values that serve only to indicate tree
structure, but which do not provide access to new data.
In Java, the most typical implementation is not to store any actual data in a
node, but rather a reference to the data record. In this case, each node will typically
store three pointers, all of which are overhead, resulting in an overhead fraction of
3P=(3P + D).
If only leaves store data values, then the fraction of total space devoted to over-
head depends on whether the tree is full. If the tree is not full, then conceivably
there might only be one leaf node at the end of a series of internal nodes. Thus,
the overhead can be an arbitrarily high percentage for non-full binary trees. The
overhead fraction drops as the tree becomes closer to full, being lowest when the
tree is truly full. In this case, about one half of the nodes are internal.
Great savings can be had by eliminating the pointers from leaf nodes in full bi-
nary trees. Again assume the tree stores a reference to the data field. Because about
half of the nodes are leaves and half internal nodes, and because only internal nodes
now have child pointers, the overhead fraction in this case will be approximately
n
2 (2P)
P
P + D :
2 (2P) + Dn =
n
If P = D, the overhead drops to about one half of the total space. However, if only
leaf nodes store useful information, the overhead fraction for this implementation is
actually three quarters of the total space, because half of the “data” space is unused.
If a full binary tree needs to store data only at the leaf nodes, a better imple-
mentation would have the internal nodes store two pointers and no data field while
the leaf nodes store only a reference to the data field. This implementation requires
Search WWH ::




Custom Search