Java Reference
In-Depth Information
Note: A perfect hash function maps each search key into a different integer that is suitable
as an index to the hash table.
21.4
Typical hashing. Because we need a database of all street addresses in the previous example, we
must have one entry in the hash table for each telephone number. Our perfect hash function needs
a hash table this large because it produces 10,000 different indices between 0 and 9999 from the
10,000 possible search keys. This hash table is always full if every telephone number in the 555
exchange is assigned. Although a full hash table is quite reasonable for this application, most
hash tables are not full and can even be sparse —that is, have only a few of their elements actually
in use.
For example, if our small town required only 700 telephone numbers, most of the 10,000-
location hash table would be unused. We would waste most of the space allocated to the hash
table. If the 700 numbers were not sequential, we would need a different hash function if we
wanted to use a smaller hash table.
We might develop this hash function as follows. Given a nonnegative integer i and a hash
table with n locations, the value of i modulo n ranges from 0 to n - 1. Since i is nonnegative, i
modulo n is the integer remainder after dividing i by n . This value is a valid index for the hash
table. So, a hash function h for a telephone number could have the following algorithm:
Algorithm getHashIndex(phoneNumber)
// Returns an index to an array of tableSize locations.
i = last four digits of phoneNumber
return i % tableSize
This hash function—like typical hash functions—performs two steps:
1.
Convert the search key to an integer called the hash code .
2.
Compress the hash code into the range of indices for the hash table.
Often the search key is not an integer, and frequently it is a string. So, a hash function first converts
the key to an integer hash code. Next, it transforms that integer into one that is suitable as an index
to the particular hash table.
The hash function that the algorithm getHashIndex describes is not a perfect hash function when
tableSize is less than 10,000. Since 10,000 telephone numbers map into tableSize indices, some
telephone numbers will map into the same index. We call such an occurrence a collision . For exam-
ple, if tableSize is 101, getHashIndex("555-1214") and getHashIndex("555-8132") each map
into 52. If we have already stored the street address for 555-1214 in hashTable[52] , as Figure 21-2
shows, what will we do with the address for 555-8132? Handling such collisions is called collision
resolution . Before we look at collision resolution, we explore hash functions a bit further.
Note: Typical hash functions are not perfect, because they can allow more than one search
key to map into a single index, causing a collision in the hash table.
 
Search WWH ::




Custom Search