Java Reference
In-Depth Information
5.6.3 interpolation search
The binary search is very fast at searching a sorted static array. In fact, it is so fast
that we would rarely use anything else. A static searching method that is sometimes
faster, however, is an interpolation search , which has better Big-Oh performance
on average than binary search but has limited practicality and a bad worst case. For
an interpolation search to be practical, two assumptions must be satisfied:
1.
Each access must be very expensive compared to a typical instruc-
tion. For example, the array might be on a disk instead of in memory,
and each comparison requires a disk access.
2.
The data must not only be sorted, it must also be fairly uniformly distrib-
uted. For example, a phone book is fairly uniformly distributed. If the
input items are {1, 2, 4, 8, 16,
}, the distribution is not uniform.
These assumptions are quite restrictive, so you might never use an inter-
polation search. But it is interesting to see that there is more than one way to
solve a problem and that no algorithm, not even the classic binary search, is
the best in all situations.
The interpolation search requires that we spend more time to make an
accurate guess regarding where the item might be. The binary search always
uses the midpoint. However, searching for Hank Aaron in the middle of the
phone book would be silly; somewhere near the start clearly would be more
appropriate. Thus, instead of mid , we use next to indicate the next item that we
will try to access.
Here's an example of what might work well. Suppose that the range con-
tains 1,000 items, the low item in the range is 1,000, the high item in the range
is 1,000,000, and we are searching for an item of value 12,000. If the items
are uniformly distributed, then we expect to find a match somewhere near the
twelfth item. The applicable formula is
-
a high
x
a low
[
]
next
=
low
+
×
(
high
-
low
-
1
)
-------------------------------------------
[
]
-
a low
[
]
The subtraction of 1 is a technical adjustment that has been shown to perform well
in practice. Clearly, this calculation is more costly than the binary search calcula-
tion. It involves an extra division (the division by 2 in the binary search is really
just a bit shift, just as dividing by 10 is easy for humans), multiplication, and four
subtractions. These calculations need to be done using floating-point operations.
One iteration may be slower than the complete binary search. However, if the cost
of these calculations is insignificant when compared to the cost of accessing an
item, speed is immaterial; we care only about the number of iterations.
 
Search WWH ::




Custom Search