Java Reference

In-Depth Information

If there are no or only a few collisions, then adding, locating, and removing hash

table elements takes constant or O(1) time.

More generally, for this algorithm to be effective, the bucket sizes must be small. If

the table length is small, then collisions are unavoidable, and each bucket will get

quite full. Then the linear search through a bucket is time-consuming. In the worst

case, where all elements end up in the same bucket, a hash table degenerates into a

linked list!

In order to reduce the chance for collisions, you should make a hash table somewhat

larger than the number of elements that you expect to insert. An excess capacity of

about 30 percent is typically recommended. According to some researchers, the hash

table size should be chosen to be a prime number to minimize the number of

collisions.

The table size should be a prime number, larger than the expected number of

elements.

Adding an element is a simple extension of the algorithm for finding an object. First

compute the hash code to locate the bucket in which the element should be inserted.

Try finding the object in that bucket. If it is already present, do nothing. Otherwise,

insert it.

Removing an element is equally simple. First compute the hash code to locate the

bucket in which the element should be inserted. Try finding the object in that bucket.

If it is present, remove it. Otherwise, do nothing.

As long as there are few collisions, an element can also be added or removed in

constant or O(1) time.

At the end of this section you will find the code for a simple implementation of a hash

set. That implementation takes advantage of the
AbstractSet
class, which already

implements most of the methods of the
Set
interface.

In this implementation you must specify the size of the hash table. In the standard

library, you don't need to supply a table size. If the hash table gets too full, a new

table of twice the size is created, and all elements are inserted into the new table.