is sequential readahead, which has long been
a standard practice among operating systems
(Feiertag and Organick, 1971; McKusick, Joy,
Leffler, and Fabry, 1984). There are also more
comprehensive works to mine correlations
between files (Kroeger and Long, 2001; Paris,
Amer, and Long, 2003; Whittle, Paris, Amer,
Long, and Burns, 2003) or data blocks (Li, Chen,
and Zhou, 2005).
On the other hand, informed prefetching works
with hints from individual applications about their
future I/O operations. The hints can either be
explicitly controlled by the application (Cao, Felten,
Karlin, and Li, 1996; Patterson, Gibson, Ginting,
Stodolsky, and Zelenka, 1995; Patterson, Gibson,
and Satyanarayanan, 1993) or be generated
automatically (Brown, Mowry, and Krieger, 2001).
Caching is another ubiquitous performance
optimization technique. It is common practice to
share prefetching memory with cache memory, which
opens the door to interactions between prefetching
and caching. Besides the basic defensive support
for readahead thrashing discussed in this chapter,
there are also comprehensive works dealing with
integrated prefetching and caching (Butt, Gniady,
and Hu, 2005; Cao, Felten, Karlin, and Li, 1995;
Cao, Felten, Karlin, and Li, 1996; Dini, Lettieri,
and Lopriore, 2006; Gill and Modha, 2005; Pat-
terson, Gibson, Ginting, Stodolsky, and Zelenka,
1995; Itshak and Wiseman, 2008) and schemes to
dynamically adapt the prefetching memory (Li and
Shen, 2005) or prefetch depth (Gill and Bathen,
2007; Li, Shen, and Papathanasiou, 2007; Liang,
Jiang, and Zhang, 2007).
Different storage devices, disk array configura-
tions, and workloads have different optimal I/O
sizes. Some applications (e.g. FTP) are not sensi-
tive to I/O latency and can therefore safely use
a large readahead size; others (e.g. VOD) may
be sensitive to I/O latency and should use a
more conservative I/O size.
Besides the tradeoff between throughput and
latency, the prefetching hit ratio is another common
design consideration. It is defined as the fraction
of data pages faulted in by the prefetching algorithm
that are actually accessed before being reclaimed.
To keep the prefetching hit ratio high, an adaptive
readahead size should be employed, because even
when we are sure that an application is doing
sequential reads, we have no idea how long the read
sequence will last. For example, an application may
scan one file from beginning to end, while accessing
only the first two pages of another file.
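The hit-ratio definition above reduces to a ratio of two counters. A minimal sketch follows; the function and counter names are ours, purely illustrative, and not taken from any particular kernel:

```python
def prefetch_hit_ratio(pages_prefetched, pages_accessed_before_reclaim):
    """Fraction of prefetched pages that were actually used.

    pages_prefetched: total pages faulted in by the prefetching algorithm
    pages_accessed_before_reclaim: how many of those pages were read
        at least once before being reclaimed from memory
    """
    if pages_prefetched == 0:
        return 0.0  # nothing prefetched, nothing wasted
    return pages_accessed_before_reclaim / pages_prefetched

# e.g. 96 of 128 prefetched pages were used before reclaim
print(prefetch_hit_ratio(128, 96))  # 0.75
```

A low value of this ratio directly measures the memory and disk bandwidth wasted on pages that were fetched but never read.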
Fortunately, common I/O behaviors are to some
extent inferable. First, the number of pages still
to be read (setting aside the end-of-file case) is
normally positively correlated with the number of
pages already accessed. Second, the larger the read
size, the more likely the pattern is to be repeated,
because a large read size implies an optimization
made by the developer for an expected long run
of reads. Based on these two empirical rules, the
probability that the current access pattern will be
repeated can be estimated, from which an adaptive
prefetching size can be calculated.
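One way to turn the two empirical rules into an adaptive size is a capped multiplicative ramp-up: grow the readahead window with the length of the run observed so far and with the application's own read size. The sketch below is a hypothetical illustration of this idea, not the actual algorithm of any specific kernel; the constants are assumed values:

```python
MAX_READAHEAD_PAGES = 32  # assumed cap, e.g. 128KB of 4KB pages

def next_readahead_size(pages_accessed_so_far, last_read_size):
    """Hypothetical adaptive sizing based on the two empirical rules:
    (1) the more pages already accessed, the longer the remaining run
        is expected to be;
    (2) a larger application read size makes repetition more likely,
        so it justifies a more aggressive window.
    Ramps up multiplicatively, capped at MAX_READAHEAD_PAGES."""
    estimate = max(2 * pages_accessed_so_far, 4 * last_read_size)
    return min(max(estimate, 1), MAX_READAHEAD_PAGES)

# a short run ramps up gently; a long run saturates at the cap
print(next_readahead_size(2, 1))    # 4
print(next_readahead_size(100, 8))  # 32
```

The cap bounds the damage when the estimate is wrong: a mispredicted window wastes at most MAX_READAHEAD_PAGES pages, which keeps the hit ratio from collapsing on short runs.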
The prefetching hit ratio has served as one
major goal in the design of prefetching algorithms.
A low hit ratio wastes memory and disk I/O
resources, a cost that was unacceptable in the days
when these resources were scarce and precious.
Traditional prefetching algorithms therefore tend
to prefetch only for strictly sequential reads.
They employ conservative policies for both
prefetching size and sequential detection, seeking a
high prefetching hit ratio. For example, Linux 2.6
has used a conservative 128KB readahead size
since its infancy.
Design Tradeoffs of Prefetching
The prefetching size can greatly impact I/O
performance and is considered the main I/O
parameter. One must trade off between throughput
and latency when determining its value. The general
guideline is: the prefetching size should be large
enough to deliver good I/O throughput, yet small
enough to prevent undesirably long I/O latencies.
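This tradeoff can be illustrated with a simple disk model in which each request pays a fixed positioning cost plus a transfer time proportional to its size. The seek time and bandwidth figures below are assumed round numbers for illustration, not measurements:

```python
SEEK_MS = 8.0          # assumed average seek + rotational delay
BANDWIDTH_MB_S = 80.0  # assumed sequential transfer rate

def request_latency_ms(size_kb):
    """Per-request latency: fixed positioning cost plus transfer time."""
    transfer_ms = size_kb / 1024 / BANDWIDTH_MB_S * 1000
    return SEEK_MS + transfer_ms

def effective_throughput_mb_s(size_kb):
    """Sustained throughput when requests are issued back to back."""
    return (size_kb / 1024) / (request_latency_ms(size_kb) / 1000)

# A larger prefetching size amortizes the seek cost over more data,
# raising throughput, but each individual request takes longer.
for size_kb in (128, 1024):
    print(size_kb, request_latency_ms(size_kb),
          effective_throughput_mb_s(size_kb))
```

In this model, going from 128KB to 1024KB requests roughly quadruples sustained throughput while more than doubling per-request latency, which is exactly why latency-sensitive workloads such as VOD favor the smaller, more conservative size.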