Introducing Hashing - Data Structures and Abstractions with Java

Java Reference

In-Depth Information

Question 8 With distinct search keys and separate chaining with sorted chains, write an

algorithm for the dictionary's add method.

Question 9 Can you define an iteration of a dictionary's search keys in sorted order when

you use hashing in its implementation? Explain.

C HAPTER S UMMARY

•

Hashing is a dictionary implementation that stores entries into an array called the hash table. A hash function

transforms an entry's search key into the index of the array location that will contain the entry.

•

All classes have a method hashCode that returns an integer hash code. If a class's instances are to be search

keys, you should override hashCode to produce suitable hash codes. A hash code should depend on the entire

search key.

•

A hash function uses hashCode to compute a hash code from a search key and then compresses that hash

code into an index to the hash table. A typical way to compress the hash code c is to compute c modulo n,

where n is a prime number and the size of the hash table. This computation produces an index whose magni-

tude lies between 0 and n - 1.

•

A perfect hash function maps each search key into a distinct location in the hash table. You can find such a

function if you know all possible search keys. Using a perfect hash function makes possible O(1) implemen-

tations of the dictionary operations.

•

With a typical hash function, more than one search key can map into the same location in the hash table. This

occurrence is called a collision.

•

Various methods are available to deal with collisions. Among them are open addressing and separate

chaining.

•

With open addressing, all entries that map into the same location are ultimately stored within the hash table.

These entries are in a sequence of locations called the probe sequence. Several different versions of open

addressing are common. Linear probing uses consecutive locations. Quadratic probing spaces the locations

in a probe sequence at increasing increments. These increments are 1, 4, 9, and so on—that is, the squares of

the integers 1, 2, 3, . . . . Double hashing uses a fixed increment that depends on the search key. A second

hash function provides this increment.

•

With open addressing, you remove an entry by placing it into a removed state. You do not set its table

location to null , because that would terminate subsequent searches prematurely. You retrieve an entry by

searching its probe sequence, ignoring removed entries, until you either find the desired entry or encoun-

ter null . You perform the same search when you add a new entry, but while searching, you note the first

location—if any—that references a removed entry. You use this location for the added entry. If no such

location exists, the addition extends the probe sequence by using the null location encountered after

searching the entire sequence.

•

A disadvantage of linear probing and quadratic probing is clustering. Clustering lengthens a probe sequence

and so increases the time to search it. Double hashing avoids this problem.

•

With separate chaining, the hash table is an array of buckets. All entries that map into the same array loca-

tion are stored in the bucket that the location references. Each bucket can be a chain of linked nodes, for

example. That is, each location in the hash table can reference the beginning of a chain.

Data Structures and Abstractions with Java

Search WWH ::

Custom Search

Home