Java Reference
In-Depth Information
is p(K;i) = (i 2 + i)=2, then every slot in the table will be visited by the probe
function.
Both pseudo-random probing and quadratic probing eliminate primary cluster-
ing, which is the problem of keys sharing substantial segments of a probe sequence.
If two keys hash to the same home position, however, then they will always follow
the same probe sequence for every collision resolution method that we have seen so
far. The probe sequences generated by pseudo-random and quadratic probing (for
example) are entirely a function of the home position, not the original key value.
This is because function p ignores its input parameter K for these collision resolu-
tion methods. If the hash function generates a cluster at a particular home position,
then the cluster remains under pseudo-random and quadratic probing. This problem
is called secondary clustering.
To avoid secondary clustering, we need to have the probe sequence make use of
the original key value in its decision-making process. A simple technique for doing
this is to return to linear probing by a constant step size for the probe function, but
to have that constant be determined by a second hash function, h 2 . Thus, the probe
sequence would be of the form p(K;i) = ih 2 (K). This method is called double
hashing.
Example9.11 Assume a hash table has size M = 101, and that there
are three keys k 1 , k 2 , and k 3 with h(k 1 ) = 30, h(k 2 ) = 28, h(k 3 ) = 30,
h 2 (k 1 ) = 2, h 2 (k 2 ) = 5, and h 2 (k 3 ) = 5. Then, the probe sequence
for k 1 will be 30, 32, 34, 36, and so on. The probe sequence for k 2 will
be 28, 33, 38, 43, and so on. The probe sequence for k 3 will be 30, 35,
40, 45, and so on. Thus, none of the keys share substantial portions of the
same probe sequence. Of course, if a fourth key k 4 has h(k 4 ) = 28 and
h 2 (k 4 ) = 2, then it will follow the same probe sequence as k 1 . Pseudo-
random or quadratic probing can be combined with double hashing to solve
this problem.
A good implementation of double hashing should ensure that all of the probe
sequence constants are relatively prime to the table size M. This can be achieved
easily. One way is to select M to be a prime number, and have h 2 return a value in
the range 1 h 2 (K) M 1. Another way is to set M = 2 m for some value m
and have h 2 return an odd value between 1 and 2 m .
Figure 9.9 shows an implementation of the dictionary ADT by means of a hash
table. The simplest hash function is used, with collision resolution by linear prob-
ing, as the basis for the structure of a hash table implementation. A suggested
project at the end of this chapter asks you to improve the implementation with
other hash functions and collision resolution policies.
 
Search WWH ::




Custom Search