Introducing Hashing - Data Structures and Abstractions with Java

Java Reference

In-Depth Information

21.17

Reusing locations in the hash table during an addition. Recall the hash table pictured in

Figure 21-4. The entry whose search key is 555-2072 mapped into hashTable[52] but was added

to the hash table at hashTable[55] due to collisions. Figure 21-6a shows this hash table again, but

in a simpler form. The four occupied locations constitute a probe sequence; the other locations con-

tain null . Since the search key 555-2072 maps into the first location of the probe sequence but

actually occurs in the fourth location, a brief sequential search will find it.

Now let's try removing the middle two entries of the probe sequence, as Figure 21-6b shows.

A search for 555-2072 starts at the beginning of the probe sequence, must continue beyond the

removed entries, and stops successfully at the last location in the probe sequence. If 555-2072

does not occur in this last location, the search will end unsuccessfully at the next location, since

it contains null . Figure 21-6c illustrates these searches.

Finally, consider what happens when we add an entry that maps into this probe sequence.

For example, the search key 555-1062 maps into hashTable[52] . The add operation first must

see whether this search key is in the hash table already. To do so, it searches the probe

sequence. It has to search the entire probe sequence and reach a null location to discover that

555-1062 is not in the table. Figure 21-6d shows that this search ends at hashTable[56] .

Should add place the new entry in this location? It could, but that would fill the hash table

faster than if add reused a location that is presently in the available state. Two such locations

are at the indices 53 and 54. We should place the new entry at hashTable[53] —that is, closest

to the beginning of the probe sequence—so we can find it more quickly later. Figure 21-6e

illustrates the hash table after this addition.

Note: Searches that dictionary operations require when open addressing resolves

collisions

•

To retrieve an entry, getValue(key) searches the probe sequence for key . It examines

entries that are present and ignores locations that are in the available state. The search

stops when either key is found or null is reached.

•

The operation remove(key) searches the probe sequence using the same logic as a

retrieval. If it finds key , it marks the location as available.

•

The operation add(key, value) searches the probe sequence using logic like that of

a retrieval, but it also notes the index of the first location encountered that is either

in the available state or contains null . The operation uses this location for a new

entry if key is not found.

21.18

Clustering. Collisions that are resolved with linear probing cause groups of consecutive locations

in the hash table to be occupied. Each group is called a cluster , and the phenomenon is known as

primary clustering . Each cluster is actually a probe sequence that you must search when adding,

removing, or retrieving a table entry. When few collisions occur, probe sequences remain short and

can be searched rapidly. But during an addition, any collision within a cluster increases the size of

the cluster. Bigger clusters mean longer search times following a collision. As the clusters grow in

size, they can merge into even larger clusters, compounding the problem. This occurrence can place

many entries in one part of the hash table while another part is relatively empty.

Data Structures and Abstractions with Java

Search WWH ::

Custom Search

Home