History-Aware Prefetching
After we identify a history trail that matches the current trail for a small number of blocks (e.g., four blocks in our prototype), history-aware prefetching is initiated. To identify the data blocks to prefetch, we set up a trail extent centered at the last matched block, say block B. We then follow the history trails from B within the extent to obtain the set of blocks accessed on the matched history trails. Suppose t is the access index of block B used in forming a matched history trail, and T is the access index gap threshold. We search the extent for blocks with an access index between t and t+T, which yields the set of blocks to prefetch. We prefetch the non-resident blocks in the order of their LBNs to minimize the disk positioning overhead.
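To make this selection step concrete, the sketch below (in C) gathers the candidate blocks from a matched trail extent and submits the non-resident ones in LBN order. It is only a minimal illustration of the description above, not the DiskSeen code: the trail_entry layout and the is_resident() and issue_prefetch() helpers are assumptions made for the example.

#include <stdint.h>
#include <stdlib.h>

/* One record on a history trail in the block table
 * (hypothetical layout, for illustration only). */
struct trail_entry {
    uint64_t lbn;          /* logical block number on disk        */
    uint32_t access_index; /* position of the access on the trail */
};

/* Assumed helpers: residency check and prefetch submission. */
extern int  is_resident(uint64_t lbn);
extern void issue_prefetch(uint64_t lbn);

static int cmp_lbn(const void *a, const void *b)
{
    uint64_t la = ((const struct trail_entry *)a)->lbn;
    uint64_t lb = ((const struct trail_entry *)b)->lbn;
    return (la > lb) - (la < lb);
}

/* Collect the blocks whose access index lies in (t, t+T] on the matched
 * trail extent, drop those already resident in memory, and prefetch the
 * rest in LBN order to keep disk head movement small. */
static void prefetch_from_trail(const struct trail_entry *extent, size_t n,
                                uint32_t t, uint32_t T)
{
    struct trail_entry *cand = malloc(n * sizeof(*cand));
    size_t m = 0;

    if (!cand)
        return;

    for (size_t i = 0; i < n; i++) {
        uint32_t idx = extent[i].access_index;
        if (idx > t && idx <= t + T && !is_resident(extent[i].lbn))
            cand[m++] = extent[i];
    }

    qsort(cand, m, sizeof(*cand), cmp_lbn);  /* order requests by LBN */

    for (size_t i = 0; i < m; i++)
        issue_prefetch(cand[i].lbn);

    free(cand);
}

Issuing the requests in LBN order rather than in trail order lets them be served with largely one-directional head movement, which is the positioning-overhead saving mentioned above.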
Similar to sequence-based prefetching, we adopt two windows for history-aware prefetching. The prefetched blocks are first placed in the current window; starting from the last prefetched block, we further prefetch blocks into a readahead window. The initial size of both windows is 8 blocks. Prefetching is stopped if the window size is less than min (8 blocks); when the window size is larger than max (64 blocks), only the first max blocks are prefetched into memory. There must be at least one matched history trail; otherwise, history-aware prefetching is aborted. When multiple matched history trails are found, we prefetch only the intersection of these trails. The two history-aware windows are shifted forward in the same way as in sequence-based prefetching. If history-aware prefetching aborts, sequence-based prefetching is attempted.
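The intersection rule and the window limits can be sketched as follows. This is a hedged illustration under assumed data structures (sorted per-trail candidate lists and fixed MIN_WIN/MAX_WIN constants), not the actual implementation.

#include <stddef.h>
#include <stdint.h>

#define MIN_WIN 8   /* prefetching stops below this many blocks        */
#define MAX_WIN 64  /* at most this many blocks are prefetched at once */

/* Binary search in a sorted candidate list. */
static int contains(const uint64_t *list, size_t n, uint64_t lbn)
{
    size_t lo = 0, hi = n;

    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (list[mid] == lbn)
            return 1;
        if (list[mid] < lbn)
            lo = mid + 1;
        else
            hi = mid;
    }
    return 0;
}

/* Keep only the blocks of the first matched trail that appear on every
 * other matched trail, then apply the window limits: return 0 (abort)
 * when fewer than MIN_WIN blocks remain, and keep at most MAX_WIN.
 * 'trails' holds k sorted candidate lists of lengths 'lens'; 'out'
 * must have room for MAX_WIN entries. */
static size_t intersect_trails(const uint64_t *const *trails,
                               const size_t *lens, size_t k,
                               uint64_t *out)
{
    size_t m = 0;

    for (size_t i = 0; i < lens[0] && m < MAX_WIN; i++) {
        uint64_t lbn = trails[0][i];
        size_t j;

        for (j = 1; j < k; j++)
            if (!contains(trails[j], lens[j], lbn))
                break;

        if (j == k)                 /* present on every matched trail */
            out[m++] = lbn;
    }

    return (m < MIN_WIN) ? 0 : m;
}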
Balancing Memory Allocation

The memory allocation between the prefetching area and the caching area can affect system performance. Obviously, the larger the prefetching area is, the more prefetched data blocks can be held; meanwhile, the more likely it is that blocks in the caching area are evicted because less space is available to them. Essentially, we attempt to balance the performance gain of reduced memory misses through aggressive prefetching against the loss of increased memory misses due to less memory space for caching.

DiskSeen adaptively allocates memory to the prefetching area and the caching area to maximize system performance, as follows. We extend the reclamation queue in the prefetching area with a segment of 2048 blocks to hold the metadata of blocks evicted from the queue. We also set up a FIFO queue of the same size as this segment for the caching area, to hold the metadata of blocks evicted from the caching area. The runtime is divided into epochs, during each of which N_p-area disk blocks are requested, where N_p-area is a sample of the current size of the prefetching area, in blocks. In each epoch we monitor the numbers of hits to these two segments, denoted H_prefetch and H_cache, respectively. If |H_prefetch - H_cache| / N_p-area is larger than 10%, we move 128 blocks of memory from the area with fewer hits to the other area to balance the misses between the two. The intuition behind this design is that, by tracking the hits to the blocks evicted from both areas, we can tell which area would eliminate more memory misses if it were enlarged, so we always attempt to allocate more memory to the area that can save more memory misses.
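The adaptive allocation policy can be summarized in a short routine. The sketch below follows the description above under assumed names: the area_state counters and the move_blocks() helper are hypothetical, and this is not the DiskSeen source code.

#include <stdint.h>

#define GHOST_SEGMENT_BLOCKS 2048  /* metadata entries kept per ghost segment */
#define MOVE_UNIT            128   /* blocks moved per adjustment             */

struct area_state {
    uint64_t size_blocks;  /* current size of the area, in blocks              */
    uint64_t ghost_hits;   /* hits this epoch to blocks evicted from the area  */
};

/* Assumed helper that actually transfers n blocks of memory between areas. */
extern void move_blocks(struct area_state *from, struct area_state *to,
                        uint64_t n);

/* Called once per epoch, i.e., after N_p-area block requests, where
 * n_p_area samples the prefetching area's current size in blocks.
 * If the ghost-hit difference exceeds 10% of n_p_area, shift MOVE_UNIT
 * blocks from the area with fewer ghost hits to the one with more. */
static void rebalance_epoch(struct area_state *prefetch_area,
                            struct area_state *cache_area,
                            uint64_t n_p_area)
{
    uint64_t hp = prefetch_area->ghost_hits;
    uint64_t hc = cache_area->ghost_hits;
    uint64_t diff = (hp > hc) ? hp - hc : hc - hp;

    if (diff * 10 > n_p_area) {  /* |H_prefetch - H_cache| / N_p-area > 10% */
        if (hp > hc)
            move_blocks(cache_area, prefetch_area, MOVE_UNIT);
        else
            move_blocks(prefetch_area, cache_area, MOVE_UNIT);
    }

    /* Start the hit counts afresh for the next epoch. */
    prefetch_area->ghost_hits = 0;
    cache_area->ghost_hits = 0;
}

Because the ghost segments record only metadata, tracking these hits costs little memory while indicating which area would have benefited from more space.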
Implementation Issues
In order to evaluate the performance of DiskSeen in a mainstream operating system, we implemented a prototype of DiskSeen in the Linux 2.6.11 kernel. The current Linux kernel conducts file-level prefetching at the generic file system level. Similar to the sequence-based prefetching in DiskSeen, two windows are maintained to track the access pattern of each opened file; when a sequential access pattern is detected, the logical data blocks of the file are prefetched.
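As a rough illustration of the two-window idea behind such file-level readahead, the sketch below tracks sequential accesses per file and advances a readahead window. It is a simplified approximation under assumed helpers (read_pages()) and constants, not the Linux 2.6.11 readahead code.

#include <stdint.h>

#define RA_INITIAL 8   /* initial window size, in blocks (illustrative) */
#define RA_MAX     64  /* cap on window growth (illustrative)           */

struct file_ra {
    uint64_t next_expected; /* block expected if the access stays sequential */
    uint64_t ra_start;      /* first block of the readahead window           */
    uint32_t ra_size;       /* current readahead window size                 */
};

/* Assumed helper that reads 'count' blocks starting at 'start' into the cache. */
extern void read_pages(uint64_t start, uint32_t count);

/* Per-file hook invoked on each block access.  A random access resets the
 * state; when a sequential run reaches the readahead window, that window
 * becomes the current window and a new, larger one is submitted ahead. */
static void on_file_access(struct file_ra *ra, uint64_t block)
{
    if (block != ra->next_expected || ra->ra_size == 0) {
        /* Non-sequential access: restart with a small readahead window. */
        ra->ra_start = block + 1;
        ra->ra_size = RA_INITIAL;
        read_pages(ra->ra_start, ra->ra_size);
    } else if (block == ra->ra_start) {
        /* Entered the readahead window: promote it and read further ahead. */
        uint32_t new_size = ra->ra_size * 2;
        if (new_size > RA_MAX)
            new_size = RA_MAX;
        read_pages(ra->ra_start + ra->ra_size, new_size);
        ra->ra_start += ra->ra_size;
        ra->ra_size = new_size;
    }

    ra->next_expected = block + 1;
}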