Theorem 20.5 shows that we can compute the next position to probe by using an addition (to increment i), a bit shift (to evaluate 2i), a subtraction by 1 (to evaluate 2i - 1), another addition (to increment the old position by 2i - 1), a test to determine whether wraparound is needed, and a very rare subtraction to implement the mod operation. The difference is thus a bit shift, a subtraction by 1, and an addition per probe. The cost of this operation is likely to be less than the cost of doing an extra probe if complex keys (such as strings) are involved.
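The computation described above can be sketched as follows. This is an illustrative fragment, not the book's code; the class and method names (ProbeDemo, probeSequence) are made up for the example. Each probe advances 2i - 1 slots past the previous one, and the mod is replaced by a single conditional subtraction:

```java
public class ProbeDemo {
    // Return the first `probes` positions examined by quadratic probing
    // starting from `hash` in a table of the given size. Position i is
    // (hash + i*i) mod tableSize, computed incrementally: each step adds
    // 2i - 1 to the previous position (a shift, a subtraction, an addition),
    // and a conditional subtraction stands in for the mod operation.
    static int[] probeSequence(int hash, int tableSize, int probes) {
        int[] result = new int[probes];
        int currentPos = hash % tableSize;
        result[0] = currentPos;
        for (int i = 1; i < probes; i++) {
            currentPos += (i << 1) - 1;   // bit shift gives 2i; subtract 1; add
            if (currentPos >= tableSize)  // rare wraparound replaces the mod
                currentPos -= tableSize;
            result[i] = currentPos;
        }
        return result;
    }
}
```

For hash 7 in a table of size 11, the positions probed are 7, 8, 0, 5, 1, matching (7 + i²) mod 11 for i = 0, 1, 2, 3, 4.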
The final detail to consider is dynamic expansion. If the load factor exceeds 0.5, we want to double the size of the hash table. This approach raises a few issues. First, how hard will it be to find another prime number? The answer is that prime numbers are easy to find. We expect to have to test only O(log N) numbers until we find a number that is prime. Consequently, the routine shown in Figure 20.8 is very fast. The primality test takes at most O(N^(1/2)) time, so the search for a prime number takes at most O(N^(1/2) log N) time.² This cost is much less than the O(N) cost of transferring the contents of the old table to the new.
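A trial-division routine along these lines might look like the following. This is a sketch, not the code of Figure 20.8; the class name PrimeDemo is illustrative. isPrime runs in O(N^(1/2)) time, and nextPrime tests an expected O(log N) candidates:

```java
public class PrimeDemo {
    // Trial-division primality test: O(sqrt(n)) divisions in the worst case.
    static boolean isPrime(int n) {
        if (n == 2 || n == 3) return true;
        if (n < 2 || n % 2 == 0) return false;
        for (int i = 3; (long) i * i <= n; i += 2)
            if (n % i == 0) return false;
        return true;
    }

    // Smallest prime >= n. Only odd candidates are tried; by the density
    // of primes we expect to test O(log n) of them.
    static int nextPrime(int n) {
        if (n <= 2) return 2;
        if (n % 2 == 0) n++;
        while (!isPrime(n)) n += 2;
        return n;
    }
}
```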
Once we have allocated a larger array, do we just copy everything over? The answer is most definitely no. The new array implies a new hash function, so we cannot use the old array positions. Thus we have to examine each element in the old table, compute its new hash value, and insert it in the new hash table. This process is called rehashing. Rehashing is easily implemented in Java.
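The idea can be sketched with a deliberately minimal table of positive ints. RehashDemo and its helpers are made up for illustration, not the book's implementation; the point is that rehash reinserts every element rather than copying positions, because the new table length changes where each element belongs:

```java
public class RehashDemo {
    static int[] table = new int[5];   // 0 marks an empty slot in this sketch
    static int occupied = 0;

    static boolean isPrime(int n) {
        if (n < 2) return false;
        for (int i = 2; i * i <= n; i++)
            if (n % i == 0) return false;
        return true;
    }

    static int nextPrime(int n) {
        while (!isPrime(n)) n++;
        return n;
    }

    // Insert x (assumed > 0) with quadratic probing, expanding first if
    // the insertion would push the load factor past roughly 0.5.
    static void insert(int x) {
        if (occupied + 1 > table.length / 2)
            rehash();
        int pos = x % table.length;
        for (int i = 1; table[pos] != 0 && table[pos] != x; i++) {
            pos += 2 * i - 1;
            if (pos >= table.length) pos -= table.length;
        }
        if (table[pos] == 0) { table[pos] = x; occupied++; }
    }

    // Rehash: allocate a prime-sized table about twice as large, then
    // reinsert every element so the new length determines its position.
    static void rehash() {
        int[] old = table;
        table = new int[nextPrime(2 * old.length)];
        occupied = 0;
        for (int x : old)
            if (x != 0) insert(x);
    }
}
```

Inserting 1, 2, and 3 into the size-5 table triggers a rehash into a table of size 11, and all three elements land at their new positions.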
Expand the table as soon as the load factor reaches 0.5, which is called rehashing. Always double to a prime number. Prime numbers are easy to find.

When expanding a hash table, reinsert in the new table by using the new hash function.
20.4.1 Java Implementation
We are now ready to give a complete Java implementation of a quadratic probing hash table. We will do so by implementing most of HashSet and HashMap from the Collections API. Recall that HashSet and HashMap both require a hashCode method. hashCode has no tableSize parameter; the hash table algorithms perform a final mod operation internally after using the user-supplied hash function. The class skeleton for HashSet is shown in Figure 20.9. For the algorithms to work correctly, equals and hashCode must be consistent. That is, if two objects are equal, their hash values must be equal.
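The consistency requirement can be illustrated with a made-up class (SimplePoint is not from the book): equals and hashCode are overridden together, using the same fields, so that equal objects always report equal hash values.

```java
import java.util.Objects;

// Two SimplePoint objects with the same coordinates are equal, so they
// must also hash alike; otherwise a hash-based container could store
// one and fail to find the other.
public class SimplePoint {
    private final int x, y;

    public SimplePoint(int x, int y) { this.x = x; this.y = y; }

    @Override
    public boolean equals(Object other) {
        if (!(other instanceof SimplePoint)) return false;
        SimplePoint p = (SimplePoint) other;
        return x == p.x && y == p.y;
    }

    @Override
    public int hashCode() {
        return Objects.hash(x, y);  // combines exactly the fields equals compares
    }
}
```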
The hash table consists of an array of HashEntry references. Each HashEntry reference is either null or an object that stores an item and a data member that tells us whether the entry is active or deleted. Because arrays of generic
The user must provide an appropriate hashCode method for objects.
2. This routine is also required if we add a constructor that allows the user to specify an approximate initial size for the hash table. The hash table implementation must ensure that a prime number is used.
 
 