Databases Reference
In-Depth Information
Index
Salesperson File
Salesperson
Record
Record
Salesperson
Salesperson
Number
Address
Number
Number
Name
City
119
1
1
119
Taylor
New York
137
2
2
137
Baker
Detroit
186
3
3
186
Adams
Dallas
204
4
4
204
Dickens
Dallas
F I G U R E 8.12
Salesperson file on the right with index
built over the Salesperson Number field,
on the left
255
5
5
255
Lincoln
Atlanta
361
6
6
361
Carlyle
Detroit
420
7
7
420
Green
Tucson
Figure 8.12 shows the Salesperson file with an index built over the Salesperson
Number field. This is an important concept known as an ''indexed-sequential file.''
In an indexed-sequential file, the file is stored on the disk in order based on a set
of field values (in this case the salesperson numbers) and an index is built over that
same field . This allows both sequential and direct access by the key field, which can
be an advantage when applications with different retrieval requirements share the
file. The odd thing about this index is that since the Salesperson file was already
in sequence by the Salesperson Number field, when the salesperson numbers were
copied over into the index they were already in sorted order! Further, for the same
reason, the record addresses are also in order. In fact, in Figure 8.12, the Salesperson
Number field in the Salesperson file, with the list of relative record numbers next
to it, appears to be identical to the index. But then, why bother having an index
built over the Salesperson Number field at all? In principle, the reason is that when
the search algorithm processes the salesperson numbers, they have to be in primary
memory. Again in principle, it would be much more efficient to bring the smaller
index into primary memory for this purpose than to bring the entire Salesperson file
in just to process the Salesperson Number field.
Why, in the last couple of sentences, did we keep using the phrase, ''in
principle?'' The answer to this is closely tied to the question of whether simple
linear indexes are practical for use in even moderately sized information systems
applications. And the answer is that they are not. One reason (and here is where
the ''in principle'' in the last paragraph come in) is that, even if the simple linear
index is made up of just two columns, it would still be clumsy to try to move all or
even parts of it into primary memory to use it in a search. At best, it would require
many read operations to the disk on which the index is located. The second reason
has to do with inserting new disk records. Look once again at the Salesperson file
and the index in Figure 8.10. Say that a new salesperson named French is hired
and assigned salesperson number 452. Her record can be inserted at the end of the
Salesperson file, where it would become record number 8. But the index would have
to be updated, too: an index record, French 8, would have to be inserted between
the index records for Dickens and Green to maintain the crucial alphabetic or sorted
sequence of the index, Figure 8.13. The problem is that there is no obvious way
to accomplish that insertion unless we move all the index records from Green to
Taylor down one record position. In even a moderate-size file, that would clearly
be impractical!
Search WWH ::




Custom Search