Java Reference
In-Depth Information
99 DataInputStream input =
100
new
DataInputStream(
new
FileInputStream(filename));
101
for
(
int
i =
0
; i <
100
; i++)
102 System.out.print(input.readInt() +
" "
);
103 input.close();
104 }
105
catch
(IOException ex) {
106 ex.printStackTrace();
107 }
108 }
109 }
0 1 1 1 2 2 2 3 3 4 5 6 8 8 9 9 9 10 10 11 . . . (omitted)
Before you run this program, first run Listing 23.11, CreateLargeFile.java, to create the
file largedata.dat. Invoking
sort("largedata.dat", "sortedfile.dat")
(line 9)
reads data from
largedata.dat
and writes sorted data to
sortedfile.dat
. Invoking
displayFile("sortedfile.dat")
(line 12) displays the first
100
numbers in the speci-
fied file. Note that the files are created using binary I/O. You cannot view them using a text
editor such as Notepad.
The
sort
method first creates initial segments from the original array and stores the
sorted segments in a new file, f1.dat (lines 19-20), then produces a sorted file in
targetfile
(lines 23-24).
The
merge
method
merge(
int
numberOfSegments,
int
segmentSize,
String f1, String f2, String f3, String targetfile)
merges the segments in
f1
into
f3
using
f2
to assist the merge. The
merge
method is invoked
recursively with many merge steps. Each merge step reduces the
numberOfSegments
by
half and doubles the sorted segment size. After the completion of one merge step, the next
merge step merges the new segments in
f3
to
f2
using
f1
to assist the merge. The statement
to invoke the new merge method is
merge((numberOfSegments +
1
) /
2
, segmentSize *
2
,
f3, f1, f2, targetfile);
The
numberOfSegments
for the next merge step is
(numberOfSegments + 1) / 2
. For
example, if
numberOfSegments
is
5
,
numberOfSegments
is
3
for the next merge step,
because every two segments are merged but one is left unmerged.
The recursive
merge
method ends when
numberOfSegments
is
1
. In this case,
f1
con-
tains sorted data. File
f1
is renamed to
targetfile
(line 45).
23.8.4 External Sort Complexity
In the external sort, the dominating cost is that of I/O. Assume
n
is the number of elements to
be sorted in the file. In Phase I,
n
number of elements are read from the original file and output
to a temporary file. Therefore, the I/O for Phase I is
O(n)
.
In Phase II, before the first merge step, the number of sorted segments is
n
c
, where
c
is
MAX_ARRAY_SIZE
. Each merge step reduces the number of segments by half. Thus, after the
first merge step, the number of segments is
n
2
c
. After the second merge step, the number of
n
2
2
c
, and after the third merge step the number of segments is
n
2
3
c
. After log
n
c
segments is
¢
≤
Search WWH ::
Custom Search