Database Reference
In-Depth Information
The load factor f is defined as the number of records (to be) stored in the table to the
actual size of the table. Thus,
As items are added to the table, blocks of occupied cells start forming. This is
referred to as primary clustering . The result is increased likelihood of collision and
lengthened probes for subsequent items to be inserted. As the table fills up these searches
become longer.
For insertions and unsuccessful searches, the expected number of probes
is ½[1 + 1/(1 - f ) 2 ]. For successful searches, the expected number of probes is
½[1 + 1/(1 - f )]. The derivation of these formulae is beyond the scope of this course.
Linear probing has the following advantages:
The technique is conceptually simple and easy to implement.
The amount of calculation done at each stage is very small.
Disadvantages of linear probing include the following:
Primary clustering — the hash-table fills up in clusters, rather
than in even distribution over the address space.
Primary clustering leads to longer search and increased
likelihood of collision, particularly as the file becomes full.
Linear probing sometimes described as an open addressing strategy. Another
commonly discussed open addressing strategy is quadratic probing . Since its
performance is not much different from linear probing, it will not be discussed any
further. Rehashing is another open addressing strategy; it will be discussed shortly.
A2.3.2 Synonym Chaining
The synonym chaining technique uses the hash address, not as the actual storage
location for the record, but as the index into a list (implemented as a linked list or array-
list). This way, we can facilitate multiple records with the same hash address! Figure A2-7
provides a graphic representation of this approach.
 
Search WWH ::




Custom Search