head thrashings. The latter issue will be revisited in section 5.3.

readahead windows and pipelining

Each time a readahead I/O decision is made, it is recorded as a “readahead window”. A readahead window takes the form (start, size), where start is the first page index and size is the number of pages. The readahead window produced by one run of readahead is saved for reference in the next run.

Pipelining is an old technique for enhancing the utilization of computer components (Wiseman, 2001). Readahead pipelining parallelizes CPU and disk activities for better resource utilization and I/O performance. The legacy readahead algorithm adopts dual windows to do pipelining: while the application is walking through the current_window, I/O is underway asynchronously in the ahead_window. Whenever a read request crosses into the ahead_window, it becomes the current_window, and a readahead I/O is triggered to create the new ahead_window.

Readahead pipelining is all about doing asynchronous readahead. The key question is: how early should the next readahead be started asynchronously? The dual-window scheme cannot provide an exact answer, since both the read request and the ahead_window are wide ranges. As a result, it cannot control the “degree of asynchrony”.

Our solution is to introduce this as an explicit parameter, async_size: as soon as the number of not-yet-consumed readahead pages falls below this threshold, it is time to start the next readahead. async_size can be freely tuned in the range [0, size]: async_size = 0 disables pipelining, whereas async_size = size enables full pipelining. This scheme avoids maintaining two readahead windows and decouples async_size from size.

Figure 4 shows the data structures. Note that we also tag the page at start + size - async_size with PG_readahead. This newly introduced page flag is one fundamental facility in our proposed readahead framework. It was originally intended to help support interleaved reads: it is more stable than the per-file-descriptor readahead state, and can tell whether that state is still valid and dependable in the case of multiple readers sharing one file descriptor. We then quickly found it a handy tool for handling other cases, as an integral part of the following readahead call convention.
call convention
The traditional practice is to feed every read request to the readahead algorithm, and let it handle all possible system states while sorting out the access patterns and making readahead decisions. This handle-everything-in-one-interception-routine approach leads to unnecessary invocations of the readahead logic and makes it unnecessarily complex.
There are also fundamental issues with read requests. They may be too small (less than one page) or too large (more than max_readahead pages), and so require special handling. What is more, they are unreliable and confusing: the same pages may be requested more than once in the case of readahead thrashing and retried sequential reads.
The above observations lead us to two new principles: first, trap into the readahead heuristics only when it is the right time to do readahead; second, judge by the page status instead of the read requests and readahead windows whenever feasible. These principles yield a modular readahead framework that separates the following readahead trigger conditions from the main readahead logic. When either of these two types of pages is read, it is time to do readahead:
1. cache miss page: it is time for synchronous readahead. An I/O is required to fault in the current page, and the I/O size can be inflated to piggyback more pages.
2. PG_readahead page: it is time for asynchronous readahead. The PG_readahead tag is set by a previous readahead to indicate the time to do the next readahead.