THE PHYSICAL DESIGN PROCESS - Database Design and Development: An Essential Guide for IT Professionals

Database Reference

In-Depth Information

• Searching for random records inefficient

• Best if most common operation is range selection

Hash or Direct Organization

• A hash function computes addresses based on values in specified field

• Hash function specifies the block where a record must be placed

• Search for record in a block done in memory buffer

• Very fast retrieval for random data

• Inserting specific record fast and direct

• Deletion leaves wasted space in the block

• Not good for sequential retrieval

• Best if most common operation is equality selection

• Hashing can cause collisions when many records hash to the same block

• Methods for resolution of collisions (linear probing or multiple hashing)

Figure 12-8 illustrates the problem of collisions in direct file organization.

In this figure, you will note that a very simple hashing function is used—the last

two digits of the employee social security number are used to indicate the storage

address for the employee record. Employee 102-49-2121 comes on board, and his

record gets stored in Block 21. Then employee 352-11-3421 joins the company and

the hash algorithm calculates the storage address as 21, but Block 21 is already filled

File: To store employee records

Hash field: Social Security Number in employee record

Hash function: Use last two digits of SSNo as storage address

Store employee 102-49-2121

Store employee 352-11-3421

Store employee 246-15-4721

synonym chain

Figure 12-8

Direct file organization: resolution of collisions.

Search WWH ::

Custom Search

Home