Java Reference
In-Depth Information
1 // A poor hash function when tableSize is large
2 public static int hash( String key, int tableSize )
3 {
4 int hashVal = 0;
5
6 for( int i = 0; i < key.length( ); i++ )
7 hashVal += key.charAt( i );
8
9 return hashVal % tableSize;
10 }
figure 20.3
A bad hash function if
tableSize is large
Although speed is an important consideration in designing a hash function,
we also want to be sure that it distributes the keys equitably. Consequently, we
must not take our optimizations too far. An example is the hash function shown
in Figure 20.3. It simply adds the characters in the keys and returns the result
mod tableSize . What could be simpler? The answer is that little could be sim-
pler. The function is easy to implement and computes a hash value very quickly.
However, if tableSize is large, the function does not distribute the keys well.
For instance, suppose that tableSize is 10,000. Also suppose that all keys are 8 or
fewer characters long. Because an ASCII char is an integer between 0 and 127,
the hash function can assume values only between 0 and 1,016 (127 × 8). This
restriction certainly does not permit an equitable distribution. Any speed gained
by the quickness of the hash function calculation is more than offset by the
effort taken to resolve a larger than expected number of collisions. However, a
reasonable alternative is described in Exercise 20.14.
Finally, note that 0 is a possible result of the hash function, so hash tables
are indexed starting at 0.
The table runs
from 0 to
tableSize-1 .
20.2.1 hashCode in java.lang.String
In Java, library types that can be reasonably inserted into a HashSet or as keys
into a HashMap already have equals and hashCode defined. In particular the
String class has a hashCode whose implementation is critical for performance
of HashSets and HashMaps involving String s.
The history of the String hashCode method is in itself fairly instructive.
The earliest versions of Java used essentially the same implementation as
Figure 20.2, including the constant multiplier 37, but without lines 14
16.
But later on the implementation was “optimized” so that if the String was
longer than 15 characters, only 8 or 9 characters, somewhat evenly spaced in
the String , would be used to compute the hashCode . This version was used
in Java 1.0.2 and Java 1.1, but it turned out to be a bad idea because there
were many applications containing large groups of long String s that were
 
Search WWH ::




Custom Search