cases are different from patterns seen in enterprise settings. Two access pattern
studies were performed in the mid-1990s: the CHARISMA project [15] and the
Scalable I/O Initiative Applications Group study [16]. Although applications
and systems have changed somewhat since these studies were performed, many of
their conclusions about application access patterns are still valid today. We
summarize those findings here.
The CHARISMA study was conducted on two systems, a Thinking Machines
CM-5 and an Intel iPSC/860, both used by numerous scientists running
applications. The team hoped to discover commonalities between parallel
applications on the two machines in terms of the number of files read and
written, the size of the files, the typical read and write request sizes, and the
way these requests were spaced and ordered.
The team found it important to differentiate between sequential and
consecutive requests. A sequential request begins at a higher file offset than
the point at which the previous request from the same process ended, whereas a
consecutive request begins at exactly the point where the previous request ended. Almost
all write-only files were accessed sequentially, and many of the read-only files
were as well. Most of the write-only files were written consecutively, probably
because in many applications each process wrote out its data to a separate
file. Read-only files were accessed consecutively much less often, indicating
that they were read by multiple processes. Overall, about a third of the
files were accessed with a single request.
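
The distinction between sequential and consecutive requests can be made concrete with a small sketch. It assumes a hypothetical trace format, a list of (offset, length) pairs issued by a single process in order; the function name and labels are illustrative and are not taken from the CHARISMA tooling. The sketch follows the wording above literally and keeps the two labels apart; a study tallying sequential accesses may well count consecutive ones among them.

```python
def classify_requests(requests):
    """Label each request after the first one issued by a single process.

    requests: list of (offset, length) tuples in issue order
              (a hypothetical trace format, assumed for illustration).
    Returns one label per request after the first:
      'consecutive' - starts exactly where the previous request ended
      'sequential'  - starts at a higher offset than where it ended
      'other'       - starts before the previous request ended
    """
    labels = []
    for (prev_off, prev_len), (off, _length) in zip(requests, requests[1:]):
        prev_end = prev_off + prev_len
        if off == prev_end:
            labels.append("consecutive")
        elif off > prev_end:
            labels.append("sequential")
        else:
            labels.append("other")
    return labels


# Example trace: 4-byte requests at offsets 0, 4, 12, and 2.
trace = [(0, 4), (4, 4), (12, 4), (2, 4)]
labels = classify_requests(trace)
```

For this trace the second request is consecutive, the third is sequential because it skips four bytes, and the fourth falls into neither category because it seeks backward.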
Examining the sizes of the intervals between requests, the team found that
most files were accessed with only one or two interval sizes. The request sizes
were also very regular, with most applications using no more than three dis-
tinct request sizes. Tracing showed that a simple strided access pattern was
most common, with a consistent amount of data skipped between each data
item accessed. Nested strided patterns, which indicate that multidimensional
data was being accessed within the file, were also common but occurred about
half as often as the simple strided pattern.
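
As a rough illustration of these patterns, the following sketch generates the requests for a simple and a nested strided access and reports the distinct interval sizes between them. All function and parameter names are hypothetical, and the stride is assumed to be measured between the starting offsets of successive requests.

```python
def simple_strided(start, record_size, stride, count):
    """Requests of record_size bytes whose starts are stride bytes apart."""
    return [(start + i * stride, record_size) for i in range(count)]


def nested_strided(start, record_size, inner_stride, inner_count,
                   outer_stride, outer_count):
    """A simple strided pattern repeated every outer_stride bytes,
    as when reading a column block from a two-dimensional array."""
    return [(start + j * outer_stride + i * inner_stride, record_size)
            for j in range(outer_count)
            for i in range(inner_count)]


def interval_sizes(requests):
    """Distinct gaps between the end of one request and the start of the next."""
    gaps = {off - (prev_off + prev_len)
            for (prev_off, prev_len), (off, _length)
            in zip(requests, requests[1:])}
    return sorted(gaps)


simple = simple_strided(start=0, record_size=8, stride=32, count=4)
nested = nested_strided(start=0, record_size=8, inner_stride=32,
                        inner_count=2, outer_stride=256, outer_count=2)

simple_gaps = interval_sizes(simple)   # one interval size
nested_gaps = interval_sizes(nested)   # two interval sizes
```

In this sketch the simple pattern yields a single interval size and the nested pattern yields two, which is consistent with the observation that most files were accessed with only one or two interval sizes.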
The CHARISMA project team concluded that parallel I/O consists of a
wide variety of request sizes, that these requests often occur in sequential but
not consecutive patterns, and that there is a great deal of interprocess spatial
locality on I/O nodes. They believed strided I/O request support from the
programmer's interface down to the I/O node to be important for parallel
I/O systems because it can effectively increase request sizes, thereby lowering
overhead and providing opportunities for low-level optimization.
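
One way to picture strided request support is a single descriptor that stands in for many small requests. The sketch below is an assumption-laden illustration, not the interface of any particular file system: the StridedRequest type and the serve function are hypothetical, and an in-memory bytes object stands in for the file held by an I/O node.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StridedRequest:
    """Hypothetical descriptor: count records of record_size bytes,
    with starts stride bytes apart, beginning at offset start."""
    start: int
    record_size: int
    stride: int
    count: int

    def total_bytes(self):
        # Payload carried by this one request.
        return self.record_size * self.count


def serve(file_bytes, req):
    """Hypothetical I/O-node side: expand the descriptor locally and
    return all requested records in a single reply."""
    out = bytearray()
    for i in range(req.count):
        off = req.start + i * req.stride
        out += file_bytes[off:off + req.record_size]
    return bytes(out)


file_bytes = bytes(range(64))            # stand-in for file contents
req = StridedRequest(start=0, record_size=4, stride=16, count=4)
reply = serve(file_bytes, req)           # one request instead of four
```

Here one request carries 16 bytes of payload where the unaided interface would have issued four 4-byte requests, which is the sense in which strided support effectively increases request sizes. Because the I/O node sees the whole pattern at once, it is also free to reorder or merge the underlying accesses.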
The Scalable I/O Initiative Applications Group study was performed us-
ing three I/O-intensive scientific applications designed for the Paragon. The
goal of this work was to observe the patterns of access of these applications,
determine what generalizations could be made about the patterns in high-
performance applications, and discuss with the authors the reasons for choos-
ing certain approaches.
The group found that the three applications exhibited a variety of access
patterns and request sizes; no simple characterization could be made.