Database Reference
In-Depth Information
Searching for random records inefficient
Best if most common operation is range selection
Hash or Direct Organization
A hash function computes addresses based on values in specified field
Hash function specifies the block where a record must be placed
Search for record in a block done in memory buffer
Very fast retrieval for random data
Inserting specific record fast and direct
Deletion leaves wasted space in the block
Not good for sequential retrieval
Best if most common operation is equality selection
Hashing can cause collisions when many records hash to the same block
Methods for resolution of collisions (linear probing or multiple hashing)
Figure 12-8 illustrates the problem of collisions in direct file organization.
In this figure, you will note that a very simple hashing function is used—the last
two digits of the employee social security number are used to indicate the storage
address for the employee record. Employee 102-49-2121 comes on board, and his
record gets stored in Block 21. Then employee 352-11-3421 joins the company and
the hash algorithm calculates the storage address as 21, but Block 21 is already filled
File: To store employee records
Hash field: Social Security Number in employee record
Hash function: Use last two digits of SSNo as storage address
Store employee 102-49-2121
Store employee 352-11-3421
Store employee 246-15-4721
synonym chain
Figure 12-8
Direct file organization: resolution of collisions.
Search WWH ::




Custom Search