Collection Classes - Beginning Java 2 SDK

Java Reference

In-Depth Information

The Hashing Process

A map sets aside an array in which it will store key and object pairs. The index to this array is produced

from the key object by using the hash code for the object to compute an offset into the array for storing

key/object pairs. By default, this uses the hashCode() method for the object that's used as a key. This

is inherited in all classes from Object .

Note that, while every key must be unique, each key doesn't have to result in a unique hash code. When two

or more different keys produce the same hash value, it's called a collision . A HashMap object deals with

collisions by storing all the key/object pairs that have the same hash value in a linked list. If this occurs very

often, it is obviously going to slow up the process of storing and retrieving data. Retrieving an object that

resulted in a collision when it was stored will be a two-stage process. The key will be hashed to find the

location where the key/object pair should be. The linked-list will then have to be searched to sort out the

particular key we are searching on from all the others that have the same hash value.

Hashing determines where

objects are placed in a

hash map

Bucket

Key

Object

Key

Object

Key

Object

There is therefore a strong incentive to minimize collisions and the price of reducing the possibility of

collisions in a hash table is having plenty of empty space in the table.

The class Object defines the method hashCode() so any object can be used as a key and it will hash

by default. The method as it is implemented in Object in Java, however, isn't a panacea. Since it

usually uses the memory address where an object is stored to produce the hash value, distinct objects

will always produce different hash values. In one sense this is a plus, because the more likely it is that a

unique hash value will be produced for each key, the more efficient the operation of the hash map is

going to be. The downside is that different object instances that have identical data will produce

different hash values, so you can't compare them.

This becomes a nuisance if you use the default hashCode() method in objects that you're using as

keys. In this case, an object stored in a hash map can never be retrieved using a different key object

instance, even though that key object may be identical in all other respects. Yet this is precisely what

you'll want to do in many cases.

Search WWH ::

Custom Search

Home