22.1 self-adjustment and amortized analysis
Although balanced search trees provide logarithmic worst-case running time per operation, they have several limitations.

- They require storing an extra piece of balancing information per node.
- They are complicated to implement. As a result, insertions and deletions are expensive and potentially error-prone.
- They do not provide a win when easy inputs occur.
Let us examine the consequences of each of these deficiencies. First, balanced search trees require an extra data member. Although in theory this member can be as small as a single bit (as in a red-black tree), in practice the extra data member uses an entire integer for storage in order to satisfy hardware restrictions. Because computer memories are becoming huge, we must ask whether worrying about memory is a large issue. The answer in most cases is probably not, except that maintaining the extra data members requires more complex code and tends to lead to longer running times and more errors. Indeed, identifying whether the balancing information for a search tree is correct is difficult because errors lead only to an unbalanced tree. If one case is slightly wrong, spotting the errors might be difficult. Thus, as a practical matter, algorithms that allow us to remove some complications without sacrificing performance deserve serious consideration.
The real problem is that the extra data members add complications.
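To make the extra data member concrete, here is a minimal sketch (class and field names invented for illustration, not from any particular library) contrasting a plain binary search tree node with a red-black node that must carry a color field. Conceptually the color is a single bit, but it is typically stored as a full int:

```java
// Hypothetical node classes; names are illustrative only.
class BasicNode<T> {
    T element;
    BasicNode<T> left, right;

    BasicNode(T element) { this.element = element; }
}

class RedBlackNode<T> {
    static final int RED = 0, BLACK = 1;

    T element;
    RedBlackNode<T> left, right;
    // The extra balancing member: in theory one bit,
    // in practice an entire integer.
    int color = BLACK;

    RedBlackNode(T element) { this.element = element; }
}
```

Every insertion and deletion routine must keep the color fields consistent, which is where the additional code complexity and potential for subtle errors comes from.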
Second, the worst-case, average-case, and best-case performances of a balanced search tree are essentially identical. An example is a find operation for some item X. We could reasonably expect not only that the cost of the find will be logarithmic, but also that if we perform an immediate second find for X, the second access will be cheaper than the first. However, in a red-black tree, this condition is not true. We would also expect that, if we access X, Y, and Z in that order, a second set of accesses for the same sequence would be easy. This expectation matters because of the 90-10 rule. As suggested by empirical studies, the 90-10 rule states that in practice 90 percent of the accesses are to 10 percent of the data items. Thus we want easy wins for the 90 percent case, but balanced search trees do not take advantage of this rule.
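The self-adjusting idea previewed here can be illustrated on a structure simpler than a tree: a linked list that moves each accessed item to the front, so that items accessed repeatedly (the 90 percent) become cheap to find again. This is only a sketch under assumed names (the class `MoveToFrontList` is invented for illustration), not an implementation from the text:

```java
import java.util.LinkedList;

// Illustrative self-adjusting list using the move-to-front heuristic:
// each successful find moves the item to the front of the list,
// so an immediate second access costs O(1).
class MoveToFrontList<T> {
    private final LinkedList<T> items = new LinkedList<>();

    public void insert(T x) {
        items.addFirst(x);
    }

    // Linear scan from the front; on success, self-adjust by
    // moving x to the front.
    public boolean find(T x) {
        int index = items.indexOf(x);
        if (index < 0)
            return false;
        items.remove(index);
        items.addFirst(x);
        return true;
    }
}
```

Under a 90-10 access pattern, the hot 10 percent of the items migrate toward the front and are found after only a few comparisons; a balanced search tree, by contrast, charges the same logarithmic cost no matter how recently an item was accessed.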
The 90-10 rule has been used for many years in disk I/O systems. A cache stores in main memory the contents of some of the disk blocks. The hope is that when a disk access is requested, the block can be found in the main memory cache and thus save the cost of an expensive disk access. Of course, only relatively few disk blocks can be stored in memory. Even so, storing the most recently
The 90-10 rule states that 90 percent of the accesses are to 10 percent of the data items. However, balanced search trees do not take advantage of this rule.