Java Reference
In-Depth Information
Note that this insertion process is guaranteed to keep all nodes at least half full.
For example, when we attempt to insert into a full internal node of a B-tree of order
four, there will now be five children that must be dealt with. The node is split into
two nodes containing two keys each, thus retaining the B-tree property. The middle
of the five children is promoted to its parent.
B + -Trees
10.5.1
The previous section mentioned that B-trees are universally used to implement
large-scale disk-based systems. Actually, the B-tree as described in the previ-
ous section is almost never implemented, nor is the 2-3 tree as described in Sec-
tion 10.4. What is most commonly implemented is a variant of the B-tree, called
the B + -tree. When greater efficiency is required, a more complicated variant known
as the B -tree is used.
When data are static, a linear index provides an extremely efficient way to
search. The problem is how to handle those pesky inserts and deletes. We could try
to keep the core idea of storing a sorted array-based list, but make it more flexible
by breaking the list into manageable chunks that are more easily updated. How
might we do that? First, we need to decide how big the chunks should be. Since
the data are on disk, it seems reasonable to store a chunk that is the size of a disk
block, or a small multiple of the disk block size. If the next record to be inserted
belongs to a chunk that hasn't filled its block then we can just insert it there. The
fact that this might cause other records in that chunk to move a little bit in the array
is not important, since this does not cause any extra disk accesses so long as we
move data within that chunk. But what if the chunk fills up the entire block that
contains it? We could just split it in half. What if we want to delete a record? We
could just take the deleted record out of the chunk, but we might not want a lot of
near-empty chunks. So we could put adjacent chunks together if they have only
a small amount of data between them. Or we could shuffle data between adjacent
chunks that together contain more data. The big problem would be how to find
the desired chunk when processing a record with a given key. Perhaps some sort
of tree-like structure could be used to locate the appropriate chunk. These ideas
are exactly what motivate the B + -tree. The B + -tree is essentially a mechanism for
managing a sorted array-based list, where the list is broken into chunks.
The most significant difference between the B + -tree and the BST or the stan-
dard B-tree is that the B + -tree stores records only at the leaf nodes. Internal nodes
store key values, but these are used solely as placeholders to guide the search. This
means that internal nodes are significantly different in structure from leaf nodes.
Internal nodes store keys to guide the search, associating each key with a pointer
to a child B + -tree node. Leaf nodes store actual records, or else keys and pointers
to actual records in a separate disk file if the B + -tree is being used purely as an
index. Depending on the size of a record as compared to the size of a key, a leaf
Search WWH ::




Custom Search