PHYSICAL DATABASE DESIGN - Fundamentals of Database Management Systems

Databases Reference

In-Depth Information

Index

Salesperson File

Salesperson

Record

Salesperson

Number

Address

Number

Name

City

119

Taylor

New York

137

Baker

Detroit

186

Adams

Dallas

204

Dickens

Dallas

F I G U R E 8.12

Salesperson file on the right with index

built over the Salesperson Number field,

on the left

255

Lincoln

Atlanta

361

Carlyle

Detroit

420

Green

Tucson

Figure 8.12 shows the Salesperson file with an index built over the Salesperson

Number field. This is an important concept known as an ''indexed-sequential file.''

In an indexed-sequential file, the file is stored on the disk in order based on a set

of field values (in this case the salesperson numbers) and an index is built over that

same field . This allows both sequential and direct access by the key field, which can

be an advantage when applications with different retrieval requirements share the

file. The odd thing about this index is that since the Salesperson file was already

in sequence by the Salesperson Number field, when the salesperson numbers were

copied over into the index they were already in sorted order! Further, for the same

reason, the record addresses are also in order. In fact, in Figure 8.12, the Salesperson

Number field in the Salesperson file, with the list of relative record numbers next

to it, appears to be identical to the index. But then, why bother having an index

built over the Salesperson Number field at all? In principle, the reason is that when

the search algorithm processes the salesperson numbers, they have to be in primary

memory. Again in principle, it would be much more efficient to bring the smaller

index into primary memory for this purpose than to bring the entire Salesperson file

in just to process the Salesperson Number field.

Why, in the last couple of sentences, did we keep using the phrase, ''in

principle?'' The answer to this is closely tied to the question of whether simple

linear indexes are practical for use in even moderately sized information systems

applications. And the answer is that they are not. One reason (and here is where

the ''in principle'' in the last paragraph come in) is that, even if the simple linear

index is made up of just two columns, it would still be clumsy to try to move all or

even parts of it into primary memory to use it in a search. At best, it would require

many read operations to the disk on which the index is located. The second reason

has to do with inserting new disk records. Look once again at the Salesperson file

and the index in Figure 8.10. Say that a new salesperson named French is hired

and assigned salesperson number 452. Her record can be inserted at the end of the

Salesperson file, where it would become record number 8. But the index would have

to be updated, too: an index record, French 8, would have to be inserted between

the index records for Dickens and Green to maintain the crucial alphabetic or sorted

sequence of the index, Figure 8.13. The problem is that there is no obvious way

to accomplish that insertion unless we move all the index records from Green to

Taylor down one record position. In even a moderate-size file, that would clearly

be impractical!

Fundamentals of Database Management Systems

Search WWH ::

Custom Search

Home