Java Reference
In-Depth Information
20.4.2 analysis of quadratic probing
Quadratic probing
is implemented in
findPos . It uses the
previously
described trick to
avoid multiplica-
tions and mods.
Quadratic probing has not yet been mathematically analyzed, although we
know that it eliminates primary clustering. In quadratic probing, elements that
hash to the same position probe the same alternative cells, which is known as
secondary clustering. Again, the independence of successive probes cannot be
assumed. Secondary clustering is a slight theoretical blemish. Simulation
results suggest that it generally causes less than an extra one-half probe per
search and that this increase is true only for high load factors. Figure 20.6
illustrates the difference between linear probing and quadratic probing and
shows that quadratic probing does not suffer from as much clustering as does
linear probing.
Techniques that eliminate secondary clustering are available. The most
popular is double hashing, in which a second hash function is used to drive
the collision resolution. Specifically, we probe at a distance Hash 2 ( X ),
2 Hash 2 ( X ), and so on. The second hash function must be carefully chosen
(e.g., it should never evaluate to 0), and all cells must be capable of being
probed. A function such as Hash 2 ( X ) = R - ( X mod R ), with R a prime smaller
than M, generally works well. Double hashing is theoretically interesting
because it can be shown to use essentially the same number of probes as the
purely random analysis of linear probing would imply. However, it is some-
what more complicated than quadratic probing to implement and requires
careful attention to some details.
There seems to be no good reason not to use a quadratic probing strategy,
unless the overhead of maintaining a half-empty table is burdensome. That
would be the case in other programming languages if the items being stored
were very large.
In secondary clus-
tering, elements
that hash to the
same position
probe the same
alternative cells.
Secondary cluster-
ing is a minor theo-
retical blemish.
Double hashing is a
hashing technique
that does not suf-
fer from secondary
clustering. A sec-
ond hash function
is used to drive the
collision resolution.
separate chaining hashing
20.5
A popular and space-efficient alternative to quadratic probing is separate chain-
ing hashing in which an array of linked lists is maintained. For an array of
linked lists, L 0 , L 1 , ..., L M - 1 , the hash function tells us in which list to insert an
item X and then, during a find , which list contains X . The idea is that, although
searching a linked list is a linear operation, if the lists are sufficiently short, the
search time will be very fast. In particular, suppose that the load factor, N/M, is
λ
Separate chaining
hashing is a space-
efficient alternative
to quadratic prob-
ing in which an
array of linked lists
is maintained. It is
less sensitive to
high load factors.
, making the
expected number of probes for an insertion or unsuccessful search
, which is not bounded by 1.0. Thus the average list has length
λ
and the
expected number of probes for a successful search . The reason is that
a successful search must occur in a nonempty list, and in such a list we expect to
have to traverse halfway down the list. The relative cost of a successful search
λ
1
+
λ
2
 
 
Search WWH ::




Custom Search