Java Reference
In-Depth Information
The worst case for hashing generally results from an implementation
error, whereas sorted input can make binary search trees perform poorly. Bal-
anced search trees are quite expensive to implement. Hence, if no ordering
information is required and there is any suspicion that the input might be
sorted, hashing is the data structure of choice.
hashing applications
20.7
Hashing applications are abundant. Compilers use hash tables to keep track of
declared variables in source code. The data structure is called a symbol table .
Hash tables are the ideal application for this problem because only insert and
find operations are performed. Identifiers are typically short, so the hash func-
tion can be computed quickly. In this application, most searches are successful.
Another common use of hash tables is in game programs. As the program
searches through different lines of play, it keeps track of positions that it has
encountered by computing a hash function based on the position (and storing
its move for that position). If the same position recurs, usually by a simple
transposition of moves, the program can avoid expensive recomputation. This
general feature of all game-playing programs is called the transposition table .
We discussed this feature in Section 10.2, where we implemented the tic-tac-
toe algorithm.
A third use of hashing is in online spelling checkers. If misspelling
detection (as opposed to correction) is important, an entire dictionary can be
prehashed and words can be checked in constant time. Hash tables are well
suited for this purpose because the words do not have to be alphabetized.
Printing out misspellings in the order they occurred in the document is
acceptable.
Hashing applica-
tions are abundant.
summary
Hash tables can be used to implement the insert and find operations in con-
stant average time. Paying attention to details such as load factor is especially
important in the use of hash tables; otherwise, the constant time bounds are
not meaningful. Choosing the hash function carefully is also important when
the key is not a short string or integer. You should pick an easily computable
function that distributes well.
For separate chaining hashing, the load factor is typically close to 1,
although performance does not significantly degrade unless the load factor
becomes very large. For quadratic probing, the table size should be prime and
the load factor should not exceed 0.5. Rehashing should be used for quadratic
probing to allow the table to grow and maintain the correct load factor. This
 
 
Search WWH ::




Custom Search