Java Reference
In-Depth Information
One quality of a good probe sequence is that it will cycle through all slots in
the hash table before returning to the home position. Clearly linear probing (which
“skips” slots by one each time) does this. Unfortunately, not all values for c will
make this happen. For example, if c = 2 and the table contains an even number
of slots, then any key whose home position is in an even slot will have a probe
sequence that cycles through only the even slots. Likewise, the probe sequence
for a key whose home position is in an odd slot will cycle through the odd slots.
Thus, this combination of table size and linear probing constant effectively divides
the records into two sets stored in two disjoint sections of the hash table. So long
as both sections of the table contain the same number of records, this is not really
important. However, just from chance it is likely that one section will become fuller
than the other, leading to more collisions and poorer performance for those records.
The other section would have fewer records, and thus better performance. But the
overall system performance will be degraded, as the additional cost to the side that
is more full outweighs the improved performance of the less-full side.
Constant c must be relatively prime to M to generate a linear probing sequence
that visits all slots in the table (that is, c and M must share no factors). For a hash
table of size M = 10, if c is any one of 1, 3, 7, or 9, then the probe sequence
will visit all slots for any key. When M = 11, any value for c between 1 and 10
generates a probe sequence that visits all slots for every key.
Consider the situation where c = 2 and we wish to insert a record with key k 1
such that h(k 1 ) = 3. The probe sequence for k 1 is 3, 5, 7, 9, and so on. If another
key k 2 has home position at slot 5, then its probe sequence will be 5, 7, 9, and so on.
The probe sequences of k 1 and k 2 are linked together in a manner that contributes
to clustering. In other words, linear probing with a value of c > 1 does not solve
the problem of primary clustering. We would like to find a probe function that does
not link keys together in this way. We would prefer that the probe sequence for k 1
after the first step on the sequence should not be identical to the probe sequence of
k 2 . Instead, their probe sequences should diverge.
The ideal probe function would select the next position on the probe sequence
at random from among the unvisited slots; that is, the probe sequence should be a
random permutation of the hash table positions. Unfortunately, we cannot actually
select the next position in the probe sequence at random, because then we would not
be able to duplicate this same probe sequence when searching for the key. However,
we can do something similar called pseudo-random probing. In pseudo-random
probing, the ith slot in the probe sequence is (h(K) + r i ) mod M where r i is the
ith value in a random permutation of the numbers from 1 to M 1. All insertion
and search operations use the same random permutation. The probe function is
p(K;i) = Perm [i 1];
where Perm is an array of length M 1 containing a random permutation of the
values from 1 to M 1.
Search WWH ::




Custom Search