Java Reference
In-Depth Information
The Hashing Process
A map sets aside an array in which it will store key and object pairs. The index to this array is produced
from the key object by using the hash code for the object to compute an offset into the array for storing
key/object pairs. By default, this uses the hashCode() method for the object that's used as a key. This
is inherited in all classes from Object .
Note that, while every key must be unique, each key doesn't have to result in a unique hash code. When two
or more different keys produce the same hash value, it's called a collision . A HashMap object deals with
collisions by storing all the key/object pairs that have the same hash value in a linked list. If this occurs very
often, it is obviously going to slow up the process of storing and retrieving data. Retrieving an object that
resulted in a collision when it was stored will be a two-stage process. The key will be hashed to find the
location where the key/object pair should be. The linked-list will then have to be searched to sort out the
particular key we are searching on from all the others that have the same hash value.
Hashing determines where
objects are placed in a
hash map
Bucket
Key
Key
Object
Object
Key
Key
Object
Object
Key
Object
There is therefore a strong incentive to minimize collisions and the price of reducing the possibility of
collisions in a hash table is having plenty of empty space in the table.
The class Object defines the method hashCode() so any object can be used as a key and it will hash
by default. The method as it is implemented in Object in Java, however, isn't a panacea. Since it
usually uses the memory address where an object is stored to produce the hash value, distinct objects
will always produce different hash values. In one sense this is a plus, because the more likely it is that a
unique hash value will be produced for each key, the more efficient the operation of the hash map is
going to be. The downside is that different object instances that have identical data will produce
different hash values, so you can't compare them.
This becomes a nuisance if you use the default hashCode() method in objects that you're using as
keys. In this case, an object stored in a hash map can never be retrieved using a different key object
instance, even though that key object may be identical in all other respects. Yet this is precisely what
you'll want to do in many cases.
Search WWH ::




Custom Search