Java Reference
In-Depth Information
• Use as much working memory as possible. Applying more memory usually
speeds processing. In fact, more memory will have a greater effect than a
faster disk. A faster CPU is unlikely to yield much improvement in running
time for external sorting, because disk I/O speed is the limiting factor.
• If possible, use additional disk drives for more overlapping of processing
with I/O, and to allow for sequential file processing.
8.6
Further Reading
A good general text on file processing is Folk and Zoellick's File Structures: A
Conceptual Toolkit [FZ98]. A somewhat more advanced discussion on key issues in
file processing is Betty Salzberg's File Structures: An Analytical Approach [Sal88].
A great discussion on external sorting methods can be found in Salzberg's book.
The presentation in this chapter is similar in spirit to Salzberg's.
For details on disk drive modeling and measurement, see the article by Ruemm-
ler and Wilkes, “An Introduction to Disk Drive Modeling” [RW94]. See Andrew
S. Tanenbaum's Structured Computer Organization [Tan06] for an introduction to
computer hardware and organization. An excellent, detailed description of mem-
ory and hard disk drives can be found online at “The PC Guide,” by Charles M.
Kozierok [Koz05] ( www.pcguide.com ). The PC Guide also gives detailed de-
scriptions of the Microsoft Windows and UNIX (Linux) file systems.
See “Outperforming LRU with an Adaptive Replacement Cache Algorithm”
by Megiddo and Modha [MM04] for an example of a more sophisticated algorithm
than LRU for managing buffer pools.
The snowplow argument comes from Donald E. Knuth's Sorting and Searching
[Knu98], which also contains a wide variety of external sorting algorithms.
8.7
Exercises
8.1 Computer memory and storage prices change rapidly. Find out what the
current prices are for the media listed in Figure 8.1. Does your information
change any of the basic conclusions regarding disk processing?
8.2 Assume a disk drive from the late 1990s is configured as follows. The to-
tal storage is approximately 675MB divided among 15 surfaces. Each sur-
face has 612 tracks; there are 144 sectors/track, 512 bytes/sector, and 8 sec-
tors/cluster. The disk turns at 3600 rpm. The track-to-track seek time is
20 ms, and the average seek time is 80 ms. Now assume that there is a
360KB file on the disk. On average, how long does it take to read all of the
data in the file? Assume that the first track of the file is randomly placed on
the disk, that the entire file lies on adjacent tracks, and that the file completely
fills each track on which it is found. A seek must be performed each time the
I/O head moves to a new track. Show your calculations.
Search WWH ::




Custom Search