Database Reference
In-Depth Information
I0
<x0,I1> <x4, I2>
I1
I2
<x0,B0> <x2, B1>
<x4,B2> <x6, B3>
B0
B1
B2
B3
x0{15}, x1{5,6}
x2{12,14}, x3{3,4}
x4{10}, x5{8,9,15}
x6{1,2}, x7{7,13}
Figure 6.3 Illustrative tree-structured index by ISAM organization. The
data blocks are labeled I0,
...
, I2, B0,
...
, B3.
given a file of N data items. Note that we are concerned with
indexing very large datasets so that in practice b
O
(
log b (
N
/
b
))
2. In this scheme the
ISAM index was mapped directly to the layout of data on a disk storage where
the root level index searches give the proper cylinder number of the record.
The first track of each cylinder gives the track number, and this corresponds
to the second level of indexing for searches. At the lowest level, this corre-
sponds to the locations of records within a track. A sequential scan at this
level is used to locate the records. The ISAM index method is a static index
method. Subsequent insertions require the records to be managed as overflow
records that are periodically merged by reorganizing the entire ISAM index.
The indexed sequential organization illustrates the structural characteristic
of tree-structured index schemes. To circumvent the static limitations of the
ISAM, the B -Tree indexing scheme was developed. Detailed coverage of the
B -Tree and its variants such as B + -Tree, B -Tree, and Prefix- B -Tree is given in
References 9, 24, 25, and 47. The B -Tree is a dynamic, height-balanced index
scheme that grows and shrinks by recursively splitting and merging nodes
from the lowest level of the index tree up to the root node. The VSAM file
organization 55
>>
is a B + -Tree that is mapped directly to the layout of data on
disk storage.
Tree-structured index schemes typically maintain the fixed-sized keys, such
as integers or fixed-length character strings, in the index nodes. When the
keys are variable-length strings, then rather than storing the entire keys, only
sucient strings of leading characters that form separator keys are stored.
This is the basic idea of the prefix- B -Tree. An alternative index method for
long alphabetic strings is the use of a trie . 39 , 47 , 79 , 99 Suppose the keys are formed
from a domain of alphabet set
. Each node of the trie-
index at level i is comprised of all occurrences of the distinct i th characters of
keys with the same i
with cardinality
| |
1 prefix string. A node in a trie index structure has size
of at most
entries, where each entry is pair of a character and a pointer
to the lower level node.
A special kind of trie, called the sux tree, 37 , 39 , 43 , 47 can be used to index
all suxes in a text in order to carry out fast full or partial text searches. A
basic trie implementation has the disadvantage of having a single path that
| |
Search WWH ::




Custom Search