Java Reference
In-Depth Information
M = 3, after the runs have been constructed, the tapes contain the data, as
shown in Figure 21.30.
Now B1 and B2 contain a group of runs. We take the first runs from each tape,
merge them, and write the result—which is a run twice as long—to A1. Then we
take the next runs from each tape, merge them, and write the result to A2. We con-
tinue this process, alternating output to A1 and A2 until either B1 or B2 is empty. At
this point, either both are empty or one (possibly short) run is left. In the latter case,
we copy this run to the appropriate tape. We rewind all four tapes and repeat the
same steps, this time using the A tapes as input and the B tapes as output. This pro-
cess gives runs of length 4 M . We continue this process until we get one run of length
N, at which point the run represents the sorted arrangement of the input. Figures
21.31- 21.33 show how this process works for our sample input.
The algorithm will require log( N / M ) passes, plus the initial run-
constructing pass. For instance, if we have 10,000,000 records of 6,400 bytes
each and 200 MB of internal memory, the first pass creates 320 runs. We
We need
log( N/M ) passes
over the input
before we have one
giant run.
figure 21.30
Distribution of length
3 runs to two tapes
A1
A2
B1
11
81 4 7 8
99
15
B2
12
35
96
41
58
75
figure 21.31
Tapes after the first
round of merging (run
length = 6)
A1
11
12
35
81 4 6 5
A2
17
28
41
58
75
99
B1
B2
figure 21.32
Tapes after the
second round of
merging (run length
= 12)
A1
A2
B1
11
12
17
28
35
41
58
75
81 4 6 9
B2
15
Search WWH ::




Custom Search