Java Reference
In-Depth Information
Records within a slot's list can be ordered in several ways: by insertion order,
by key value order, or by frequency-of-access order. Ordering the list by key value
provides an advantage in the case of an unsuccessful search, because we know to
stop searching the list once we encounter a key that is greater than the one being
searched for. If records on the list are unordered or ordered by frequency, then an
unsuccessful search will need to visit every record on the list.
Given a table of size M storing N records, the hash function will (ideally)
spread the records evenly among the M positions in the table, yielding on average
N=M records for each list. Assuming that the table has more slots than there are
records to be stored, we can hope that few slots will contain more than one record.
In the case where a list is empty or has only one record, a search requires only one
access to the list. Thus, the average cost for hashing should be (1). However, if
clustering causes many records to hash to only a few of the slots, then the cost to
access a record will be much higher because many elements on the linked list must
be searched.
Open hashing is most appropriate when the hash table is kept in main memory,
with the lists implemented by a standard in-memory linked list. Storing an open
hash table on disk in an efficient way is difficult, because members of a given
linked list might be stored on different disk blocks. This would result in multiple
disk accesses when searching for a particular key value, which defeats the purpose
of using hashing.
There are similarities between open hashing and Binsort. One way to view
open hashing is that each record is simply placed in a bin. While multiple records
may hash to the same bin, this initial binning should still greatly reduce the number
of records accessed by a search operation. In a similar fashion, a simple Binsort
reduces the number of records in each bin to a small number that can be sorted in
some other way.
9.4.3
Closed Hashing
Closed hashing stores all records directly in the hash table. Each record R with key
value k R has a home position that is h(k R ), the slot computed by the hash function.
If R is to be inserted and another record already occupies R's home position, then
R will be stored at some other slot in the table. It is the business of the collision
resolution policy to determine which slot that will be. Naturally, the same policy
must be followed during search as during insertion, so that any record not found in
its home position can be recovered by repeating the collision resolution process.
Bucket Hashing
One implementation for closed hashing groups hash table slots into buckets. The
M slots of the hash table are divided into B buckets, with each bucket consisting
Search WWH ::




Custom Search