Java Reference
In-Depth Information
Input
Output
File
Input Buffer
RAM
Output Buffer
Run File
Figure8.7 Overview of replacement selection. Input records are processed se-
quentially. Initially RAM is filled with M records. As records are processed, they
are written to an output buffer. When this buffer becomes full, it is written to disk.
Meanwhile, as replacement selection needs records, it reads them from the input
buffer. Whenever this buffer becomes empty, the next block of records is read
from disk.
the input and output files are streams of records. Replacement selection takes the
next record in sequential order from the input stream when needed, and outputs
runs one record at a time to the output stream. Buffering is used so that disk I/O is
performed one block at a time. A block of records is initially read and held in the
input buffer. Replacement selection removes records from the input buffer one at
a time until the buffer is empty. At this point the next block of records is read in.
Output to a buffer is similar: Once the buffer fills up it is written to disk as a unit.
This process is illustrated by Figure 8.7.
Replacement selection works as follows. Assume that the main processing is
done in an array of size M records.
1. Fill the array from disk. Set LAST = M 1.
2. Build a min-heap. (Recall that a min-heap is defined such that the record at
each node has a key value less than the key values of its children.)
3. Repeat until the array is empty:
(a) Send the record with the minimum key value (the root) to the output
buffer.
(b) Let R be the next record in the input buffer. If R's key value is greater
than the key value just output ...
i. Then place R at the root.
ii. Else replace the root with the record in array position LAST, and
place R at position LAST. Set LAST = LAST 1.
(c) Sift down the root to reorder the heap.
When the test at step 3(b) is successful, a new record is added to the heap,
eventually to be output as part of the run. As long as records coming from the input
file have key values greater than the last key value output to the run, they can be
safely added to the heap. Records with smaller key values cannot be output as part
of the current run because they would not be in sorted order. Such values must be
Search WWH ::




Custom Search