Database Reference
In-Depth Information
•
Searching for random records inefficient
•
Best if most common operation is range selection
Hash or Direct Organization
•
A hash function computes addresses based on values in specified field
•
Hash function specifies the block where a record must be placed
•
Search for record in a block done in memory buffer
•
Very fast retrieval for random data
•
Inserting specific record fast and direct
•
Deletion leaves wasted space in the block
•
Not good for sequential retrieval
•
Best if most common operation is equality selection
•
Hashing can cause collisions when many records hash to the same block
•
Methods for resolution of collisions (linear probing or multiple hashing)
Figure 12-8 illustrates the problem of collisions in direct file organization.
In this figure, you will note that a very simple hashing function is used—the last
two digits of the employee social security number are used to indicate the storage
address for the employee record. Employee 102-49-2121 comes on board, and his
record gets stored in Block 21. Then employee 352-11-3421 joins the company and
the hash algorithm calculates the storage address as 21, but Block 21 is already filled
File:
To store employee records
Hash field:
Social Security Number in employee record
Hash function:
Use last two digits of SSNo as storage address
Store employee 102-49-2121
Store employee 352-11-3421
Store employee 246-15-4721
synonym chain
Figure 12-8
Direct file organization: resolution of collisions.
Search WWH ::
Custom Search