Algorithm analysis - Data Structures and Problem Solving Using Java

Java Reference

In-Depth Information

5.6.3 interpolation search

The binary search is very fast at searching a sorted static array. In fact, it is so fast

that we would rarely use anything else. A static searching method that is sometimes

faster, however, is an interpolation search , which has better Big-Oh performance

on average than binary search but has limited practicality and a bad worst case. For

an interpolation search to be practical, two assumptions must be satisfied:

1.

Each access must be very expensive compared to a typical instruc-

tion. For example, the array might be on a disk instead of in memory,

and each comparison requires a disk access.

2.

The data must not only be sorted, it must also be fairly uniformly distrib-

uted. For example, a phone book is fairly uniformly distributed. If the

input items are {1, 2, 4, 8, 16,

…

}, the distribution is not uniform.

These assumptions are quite restrictive, so you might never use an inter-

polation search. But it is interesting to see that there is more than one way to

solve a problem and that no algorithm, not even the classic binary search, is

the best in all situations.

The interpolation search requires that we spend more time to make an

accurate guess regarding where the item might be. The binary search always

uses the midpoint. However, searching for Hank Aaron in the middle of the

phone book would be silly; somewhere near the start clearly would be more

appropriate. Thus, instead of mid , we use next to indicate the next item that we

will try to access.

Here's an example of what might work well. Suppose that the range con-

tains 1,000 items, the low item in the range is 1,000, the high item in the range

is 1,000,000, and we are searching for an item of value 12,000. If the items

are uniformly distributed, then we expect to find a match somewhere near the

twelfth item. The applicable formula is

-

a high

x

a low

[

]

low

+

×

(

high

-

low

-

1

)

-------------------------------------------

[

]

-

a low

[

]

The subtraction of 1 is a technical adjustment that has been shown to perform well

in practice. Clearly, this calculation is more costly than the binary search calcula-

tion. It involves an extra division (the division by 2 in the binary search is really

just a bit shift, just as dividing by 10 is easy for humans), multiplication, and four

subtractions. These calculations need to be done using floating-point operations.

One iteration may be slower than the complete binary search. However, if the cost

of these calculations is insignificant when compared to the cost of accessing an

item, speed is immaterial; we care only about the number of iterations.

Data Structures and Problem Solving Using Java

Search WWH ::

Custom Search

Home