head thrashings. The latter issue will be revisited in section 5.3.

readahead windows and pipelining

Each time a readahead I/O decision is made, it is recorded as a “readahead window”. A readahead window takes the form (start, size), where start is the first page index and size is the number of pages. The readahead window produced by one run of readahead is saved for reference in the next run.

Pipelining is an old technique for enhancing the utilization of computer components (Wiseman, 2001). Readahead pipelining parallelizes CPU and disk activities for better resource utilization and I/O performance. The legacy readahead algorithm adopts dual windows to do pipelining: while the application is walking through the current_window, I/O is underway asynchronously in the ahead_window. Whenever a read request crosses into the ahead_window, it becomes the current_window, and a readahead I/O is triggered to create the new ahead_window.

Readahead pipelining is all about doing asynchronous readahead. The key question is: how early should the next readahead be started asynchronously? The dual-window scheme cannot provide an exact answer, since both the read request and the ahead_window are wide ranges. As a result, it cannot control the “degree of asynchrony”.

Our solution is to introduce this as an explicit parameter, async_size: as soon as the number of not-yet-consumed readahead pages falls below this threshold, it is time to start the next readahead. async_size can be freely tuned in the range [0, size]: async_size = 0 disables pipelining, whereas async_size = size enables full pipelining. This scheme avoids maintaining two readahead windows and decouples async_size from size.

Figure 4 shows the data structures. Note that we also tag the page at start + size - async_size with PG_readahead. This newly introduced page flag is one fundamental facility in our proposed readahead framework. It was originally intended to help support interleaved reads: it is more stable than the per-file-descriptor readahead state, and can tell whether that state is still valid and dependable in the case of multiple readers sharing one file descriptor. We then quickly found it a handy tool for handling other cases, as an integral part of the following readahead call convention.
call convention
The traditional practice is to feed every read request to the readahead algorithm, and let it handle all possible system states while sorting out the access patterns and making readahead decisions. This handle-everything-in-one-interception-routine approach leads to unnecessary invocations of the readahead logic and makes it unnecessarily complex.
There are also fundamental issues with read requests. They may be too small (less than one page) or too large (more than max_readahead pages), and so require special handling. What is more, they are unreliable and confusing: the same pages may be requested more than once in the case of readahead thrashing and retried sequential reads.
The above observations lead us to two new principles: first, trap into the readahead heuristics only when it is the right time to do readahead; second, judge by the page status instead of the read requests and readahead windows whenever feasible. These principles yield a modular readahead framework that separates the following readahead trigger conditions from the main readahead logic. When either of these two types of pages is read, it is time to do readahead:
1. cache miss page: it is time for synchronous readahead. An I/O is required to fault in the current page, and the I/O size can be inflated to piggyback more pages.
2. PG_readahead page: it is time for asynchronous readahead. The PG_readahead tag is set by a previous readahead to indicate the time to do the next readahead.