Theorem 20.5 shows that we can compute the next position to probe by using an addition (to increment i), a bit shift (to evaluate 2i), a subtraction by 1 (to evaluate 2i - 1), another addition (to increment the old position by 2i - 1), a test to determine whether wraparound is needed, and a very rare subtraction to implement the mod operation. The difference is thus a bit shift, a subtraction by 1, and an addition per probe. The cost of this operation is likely to be less than the cost of doing an extra probe if complex keys (such as strings) are involved.
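The computation described above can be sketched as follows. This is an illustrative fragment, not the book's code; the class and method names (ProbeDemo, probeSequence) are made up for the example. Each probe advances 2i - 1 slots past the previous one, and the mod is replaced by a single conditional subtraction:

```java
public class ProbeDemo {
    // Return the first `probes` positions examined by quadratic probing
    // starting from `hash` in a table of the given size. Position i is
    // (hash + i*i) mod tableSize, computed incrementally: each step adds
    // 2i - 1 to the previous position (a shift, a subtraction, an addition),
    // and a conditional subtraction stands in for the mod operation.
    static int[] probeSequence(int hash, int tableSize, int probes) {
        int[] result = new int[probes];
        int currentPos = hash % tableSize;
        result[0] = currentPos;
        for (int i = 1; i < probes; i++) {
            currentPos += (i << 1) - 1;   // bit shift gives 2i; subtract 1; add
            if (currentPos >= tableSize)  // rare wraparound replaces the mod
                currentPos -= tableSize;
            result[i] = currentPos;
        }
        return result;
    }
}
```

For hash 7 in a table of size 11, the positions probed are 7, 8, 0, 5, 1, matching (7 + i²) mod 11 for i = 0, 1, 2, 3, 4.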
The final detail to consider is dynamic expansion. If the load factor exceeds 0.5, we want to double the size of the hash table. This approach raises a few issues. First, how hard will it be to find another prime number? The answer is that prime numbers are easy to find. We expect to have to test only O(log N) numbers until we find a number that is prime. Consequently, the routine shown in Figure 20.8 is very fast. The primality test takes at most O(N^(1/2)) time, so the search for a prime number takes at most O(N^(1/2) log N) time.² This cost is much less than the O(N) cost of transferring the contents of the old table to the new.
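A trial-division routine along these lines might look like the following. This is a sketch, not the code of Figure 20.8; the class name PrimeDemo is illustrative. isPrime runs in O(N^(1/2)) time, and nextPrime tests an expected O(log N) candidates:

```java
public class PrimeDemo {
    // Trial-division primality test: O(sqrt(n)) divisions in the worst case.
    static boolean isPrime(int n) {
        if (n == 2 || n == 3) return true;
        if (n < 2 || n % 2 == 0) return false;
        for (int i = 3; (long) i * i <= n; i += 2)
            if (n % i == 0) return false;
        return true;
    }

    // Smallest prime >= n. Only odd candidates are tried; by the density
    // of primes we expect to test O(log n) of them.
    static int nextPrime(int n) {
        if (n <= 2) return 2;
        if (n % 2 == 0) n++;
        while (!isPrime(n)) n += 2;
        return n;
    }
}
```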
Once we have allocated a larger array, do we just copy everything over? The answer is most definitely no. The new array implies a new hash function, so we cannot use the old array positions. Thus we have to examine each element in the old table, compute its new hash value, and insert it in the new hash table. This process is called rehashing. Rehashing is easily implemented in Java.
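The idea can be sketched with a deliberately minimal table of positive ints. RehashDemo and its helpers are made up for illustration, not the book's implementation; the point is that rehash reinserts every element rather than copying positions, because the new table length changes where each element belongs:

```java
public class RehashDemo {
    static int[] table = new int[5];   // 0 marks an empty slot in this sketch
    static int occupied = 0;

    static boolean isPrime(int n) {
        if (n < 2) return false;
        for (int i = 2; i * i <= n; i++)
            if (n % i == 0) return false;
        return true;
    }

    static int nextPrime(int n) {
        while (!isPrime(n)) n++;
        return n;
    }

    // Insert x (assumed > 0) with quadratic probing, expanding first if
    // the insertion would push the load factor past roughly 0.5.
    static void insert(int x) {
        if (occupied + 1 > table.length / 2)
            rehash();
        int pos = x % table.length;
        for (int i = 1; table[pos] != 0 && table[pos] != x; i++) {
            pos += 2 * i - 1;
            if (pos >= table.length) pos -= table.length;
        }
        if (table[pos] == 0) { table[pos] = x; occupied++; }
    }

    // Rehash: allocate a prime-sized table about twice as large, then
    // reinsert every element so the new length determines its position.
    static void rehash() {
        int[] old = table;
        table = new int[nextPrime(2 * old.length)];
        occupied = 0;
        for (int x : old)
            if (x != 0) insert(x);
    }
}
```

Inserting 1, 2, and 3 into the size-5 table triggers a rehash into a table of size 11, and all three elements land at their new positions.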
Expand the table as soon as the load factor reaches 0.5, which is called rehashing. Always double to a prime number. Prime numbers are easy to find.

When expanding a hash table, reinsert in the new table by using the new hash function.
20.4.1 Java Implementation
We are now ready to give a complete Java implementation of a quadratic probing hash table. We will do so by implementing most of HashSet and HashMap from the Collections API. Recall that HashSet and HashMap both require a hashCode method. hashCode has no tableSize parameter; the hash table algorithms perform a final mod operation internally after using the user-supplied hash function. The class skeleton for HashSet is shown in Figure 20.9. For the algorithms to work correctly, equals and hashCode must be consistent. That is, if two objects are equal, their hash values must be equal.
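The consistency requirement can be illustrated with a made-up class (SimplePoint is not from the book): equals and hashCode are overridden together, using the same fields, so that equal objects always report equal hash values.

```java
import java.util.Objects;

// Two SimplePoint objects with the same coordinates are equal, so they
// must also hash alike; otherwise a hash-based container could store
// one and fail to find the other.
public class SimplePoint {
    private final int x, y;

    public SimplePoint(int x, int y) { this.x = x; this.y = y; }

    @Override
    public boolean equals(Object other) {
        if (!(other instanceof SimplePoint)) return false;
        SimplePoint p = (SimplePoint) other;
        return x == p.x && y == p.y;
    }

    @Override
    public int hashCode() {
        return Objects.hash(x, y);  // combines exactly the fields equals compares
    }
}
```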
The hash table consists of an array of HashEntry references. Each HashEntry reference is either null or an object that stores an item and a data member that tells us whether the entry is active or deleted. Because arrays of generic
The user must provide an appropriate hashCode method for objects.
2. This routine is also required if we add a constructor that allows the user to specify an approximate initial size for the hash table. The hash table implementation must ensure that a prime number is used.
 
 