Hash tables - Data Structures and Problem Solving Using Java

Java Reference

In-Depth Information

1 // A poor hash function when tableSize is large

2 public static int hash( String key, int tableSize )

3 {

4 int hashVal = 0;

5

6 for( int i = 0; i < key.length( ); i++ )

7 hashVal += key.charAt( i );

8

9 return hashVal % tableSize;

10 }

figure 20.3

A bad hash function if

tableSize is large

Although speed is an important consideration in designing a hash function,

we also want to be sure that it distributes the keys equitably. Consequently, we

must not take our optimizations too far. An example is the hash function shown

in Figure 20.3. It simply adds the characters in the keys and returns the result

mod tableSize . What could be simpler? The answer is that little could be sim-

pler. The function is easy to implement and computes a hash value very quickly.

However, if tableSize is large, the function does not distribute the keys well.

For instance, suppose that tableSize is 10,000. Also suppose that all keys are 8 or

fewer characters long. Because an ASCII char is an integer between 0 and 127,

the hash function can assume values only between 0 and 1,016 (127 × 8). This

restriction certainly does not permit an equitable distribution. Any speed gained

by the quickness of the hash function calculation is more than offset by the

effort taken to resolve a larger than expected number of collisions. However, a

reasonable alternative is described in Exercise 20.14.

Finally, note that 0 is a possible result of the hash function, so hash tables

are indexed starting at 0.

The table runs

from 0 to

tableSize-1 .

20.2.1 hashCode in java.lang.String

In Java, library types that can be reasonably inserted into a HashSet or as keys

into a HashMap already have equals and hashCode defined. In particular the

String class has a hashCode whose implementation is critical for performance

of HashSets and HashMaps involving String s.

The history of the String hashCode method is in itself fairly instructive.

The earliest versions of Java used essentially the same implementation as

Figure 20.2, including the constant multiplier 37, but without lines 14

16.

But later on the implementation was “optimized” so that if the String was

longer than 15 characters, only 8 or 9 characters, somewhat evenly spaced in

the String , would be used to compute the hashCode . This version was used

in Java 1.0.2 and Java 1.1, but it turned out to be a bad idea because there

were many applications containing large groups of long String s that were

−

Data Structures and Problem Solving Using Java

Search WWH ::

Custom Search

Home