Java Reference
In-Depth Information
could be sorted or organized using a tree structure, thereby imposing a logical or-
der on the records without physically rearranging them. One database might have
several associated index files, each supporting efficient access through a different
key field.
Each record of a database normally has a unique identifier, called the primary
key. For example, the primary key for a set of personnel records might be the
Social Security number or ID number for the individual. Unfortunately, the ID
number is generally an inconvenient value on which to perform a search because
the searcher is unlikely to know it. Instead, the searcher might know the desired
employee's name. Alternatively, the searcher might be interested in finding all
employees whose salary is in a certain range. If these are typical search requests
to the database, then the name and salary fields deserve separate indices. However,
key values in the name and salary indices are not likely to be unique.
A key field such as salary, where a particular key value might be duplicated in
multiple records, is called a secondary key. Most searches are performed using a
secondary key. The secondary key index (or more simply, secondary index) will
associate a secondary key value with the primary key of each record having that
secondary key value. At this point, the full database might be searched directly
for the record with that primary key, or there might be a primary key index (or
primary index) that relates each primary key value with a pointer to the actual
record on disk. In the latter case, only the primary index provides the location of
the actual record on disk, while the secondary indices refer to the primary index.
Indexing is an important technique for organizing large databases, and many
indexing methods have been developed. Direct access through hashing is discussed
in Section 9.4. A simple list sorted by key value can also serve as an index to the
record file. Indexing disk files by sorted lists are discussed in the following section.
Unfortunately, a sorted list does not perform well for insert and delete operations.
A third approach to indexing is the tree index. Trees are typically used to or-
ganize large databases that must support record insertion, deletion, and key range
searches. Section 10.2 briefly describes ISAM, a tentative step toward solving the
problem of storing a large database that must support insertion and deletion of
records. Its shortcomings help to illustrate the value of tree indexing techniques.
Section 10.3 introduces the basic issues related to tree indexing. Section 10.4 in-
troduces the 2-3 tree, a balanced tree structure that is a simple form of the B-tree
covered in Section 10.5. B-trees are the most widely used indexing method for
large disk-based databases, and for implementing file systems. Since they have
such great practical importance, many variations have been invented. Section 10.5
begins with a discussion of the variant normally referred to simply as a “B-tree.”
Section 10.5.1 presents the most widely implemented variant, the B + -tree.
Search WWH ::




Custom Search