specific to that used in general-purpose operating systems.
Blocks contiguous in a file may not be contiguous on disk - Though file systems generally try to place blocks that are logically contiguous in a file contiguously on the hard disk as well, this correspondence inevitably deteriorates as the file system ages or fills up. This further increases the penalty for mis-prediction.
Small files cannot benefit from prefetching - Because prefetching detects sequential data accesses within each individual file, prefetching for a small file barely has a chance to be activated before the end of the file is reached. Thus, data blocks in small files are effectively never prefetched.
Inter-file sequentiality cannot be exploited - Because prefetching cannot cross file boundaries, sequential data accesses that span multiple files cannot be detected, even if the data blocks of these files are actually located contiguously on the hard disk.
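The small-file and inter-file limitations can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the trigger threshold of three consecutive blocks, the workload of 100 two-block files, and the perfectly contiguous on-disk layout are hypothetical, and `count_sequential_hits` is a toy detector rather than any kernel's actual policy. The same access stream is examined once per file (keyed by file) and once as a single stream of physical block numbers.

```python
def count_sequential_hits(stream, trigger=3):
    """Count accesses during which a toy detector has prefetching active.

    `stream` yields (key, block) pairs. A separate sequential run length is
    kept per key, and prefetching counts as active once `trigger`
    consecutive blocks have been seen under the same key.
    """
    last = {}    # key -> last block number seen
    run = {}     # key -> length of the current sequential run
    active = 0
    for key, block in stream:
        if key in last and block == last[key] + 1:
            run[key] += 1
        else:
            run[key] = 1          # first access or broken pattern: restart
        last[key] = block
        if run[key] >= trigger:
            active += 1
    return active


# Hypothetical workload: 100 small files of 2 blocks each, laid out back to
# back on disk, read one after another.
FILES, BLOCKS_PER_FILE = 100, 2
accesses = [
    (f, b, f * BLOCKS_PER_FILE + b)     # (file id, logical block, physical block)
    for f in range(FILES)
    for b in range(BLOCKS_PER_FILE)
]

# File-level view: one detector state per file, logical offsets only.
file_level = count_sequential_hits((f, logical) for f, logical, _ in accesses)

# Disk-level view (for contrast only): one detector over physical blocks.
disk_level = count_sequential_hits(("disk", phys) for _, _, phys in accesses)

print(file_level, disk_level)   # 0 198
```

Under these assumptions the per-file detector never reaches its threshold, because each file ends after two blocks, while the very same accesses form one long sequential run when viewed as physical block numbers.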
Prefetching at logical file level
Most practical prefetch policies detect access patterns and issue prefetch requests at the logical file level (Pai, Pulavarty, & Cao, 2004). This design reflects the fact that applications usually make I/O requests through logical files, for example by reading a file via the read() system call, so their discernible access patterns can be identified in terms of logical files. In the Linux kernel, for example, once a file is opened, the logical offset of each access within the file is tracked. If the application accesses data in the file sequentially, prefetching (called readahead in the Linux kernel) is activated to speculatively read data in advance. If the sequential access pattern changes, prefetching is slowed down or stopped to avoid loading useless data blocks into memory. Such file-level prefetching is widely adopted in general-purpose operating systems, such as FreeBSD and Linux.
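The behavior described above (track the logical offset per open file, ramp prefetching up on sequential access, and back off when the pattern breaks) can be sketched as follows. This is a toy model, not the Linux readahead implementation: the class name `FileReadahead`, the window sizes, and the double-on-hit / halve-on-miss rule are all assumptions chosen for illustration.

```python
class FileReadahead:
    """Toy per-file sequential detector with an adaptive prefetch window."""

    def __init__(self, min_window=4, max_window=32):
        self.min_window = min_window    # assumed smallest useful window (blocks)
        self.max_window = max_window    # assumed cap on the window (blocks)
        self.next_expected = None       # logical block expected if access is sequential
        self.window = 0                 # current window; 0 means prefetching is off

    def on_read(self, offset, count=1):
        """Record a read of `count` blocks at logical block `offset`.

        Returns the logical block numbers this toy policy would prefetch.
        It does not track which blocks are already cached.
        """
        if offset == self.next_expected:
            # Sequential access: switch prefetching on, or double the window.
            if self.window == 0:
                self.window = self.min_window
            else:
                self.window = min(self.window * 2, self.max_window)
        else:
            # Pattern broken: slow down, and stop once the window gets too small.
            self.window //= 2
            if self.window < self.min_window:
                self.window = 0
        self.next_expected = offset + count
        start = self.next_expected
        return list(range(start, start + self.window))


ra = FileReadahead()
for block in (0, 1, 2, 100, 50):
    print(block, ra.on_read(block))
```

In this sketch the first read prefetches nothing, the second and third reads grow the window from 4 to 8 blocks, the jump to block 100 halves it, and the following jump to block 50 switches prefetching off.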
Prefetching at the logical file level is simple and portable. For example, the same prefetch policy can be applied to different file systems and benefits most applications transparently. However, disk data layout information cannot be exploited at the logical file level. Disk-specific knowledge, such as where the next prefetched block lies relative to the currently fetched block, is unknown, which makes estimating the cost of prefetching infeasible. Thus, in file-level prefetching, the effectiveness of prefetching, which is needed as feedback to adjust prefetching aggressiveness, has to be expressed as the number of mis-prefetched blocks rather than a more relevant metric, the penalty of mis-prefetching. The limitations of file-level prefetching are summarized as follows.
File system metadata blocks cannot be prefetched - File system metadata blocks, such as inode blocks, are usually placed separately from the file content data blocks and are transparent to applications. Prefetching at the file level is effective only for file content data blocks.
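The earlier contrast between counting mis-prefetched blocks and measuring their penalty can be made concrete with a small sketch. The cost model here is purely hypothetical: a fixed transfer cost per block plus a seek cost proportional to the physical distance from the previously fetched block, with made-up constants. A file-level policy has no access to these distances, which is exactly why it can compute only the first metric.

```python
def mis_prefetch_count(wasted_distances):
    """Feedback available at the file level: how many prefetched blocks went unused."""
    return len(wasted_distances)


def mis_prefetch_penalty(wasted_distances, transfer_cost=0.1, seek_cost_per_block=0.01):
    """Hypothetical disk-aware feedback: estimated time wasted on unused blocks.

    Each entry is the physical distance (in blocks) between a wasted block
    and the block fetched just before it. The linear seek model and both
    constants are illustrative assumptions, not measured values.
    """
    total = 0.0
    for distance in wasted_distances:
        total += transfer_cost + seek_cost_per_block * abs(distance)
    return total


# Two hypothetical runs with the same number of wasted blocks.
near = [1, 1, 1, 1]                  # wasted blocks adjacent to useful ones
far = [5000, -7000, 12000, 300]      # wasted blocks scattered across the disk

print(mis_prefetch_count(near), mis_prefetch_count(far))
print(mis_prefetch_penalty(near) < mis_prefetch_penalty(far))
```

Both runs look identical under the count metric, yet under the assumed cost model the scattered run is far more expensive, so a policy driven only by the count would adjust its aggressiveness the same way in both cases.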
Prefetching in disk firmware
Modern hard disks are usually equipped with a large RAM buffer (e.g., 16 MB), and the disk firmware can also apply simple prefetching policies to preload data into the disk buffer. For example, while waiting for the disk platters to rotate to the target position, the disk firmware can read the data blocks passing beneath the disk head into the disk buffer at no extra cost. In some cases, as much as a full track of data blocks can be prefetched.
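The "free" readahead during rotational delay can be sketched on an idealized single track. The function name `free_readahead` and the model itself are assumptions for illustration: sectors are numbered 0 to N-1 in rotation order, the head is already on the right track, and effects such as zoning, track skew, and head settling are ignored.

```python
def free_readahead(sectors_per_track, head_sector, target_sector):
    """Sectors that pass under the head while it waits for `target_sector`.

    In this idealized model they can be copied into the disk buffer without
    adding to the request's service time, since the head has to wait for the
    platter to rotate anyway.
    """
    wait = (target_sector - head_sector) % sectors_per_track
    return [(head_sector + i) % sectors_per_track for i in range(wait)]


# Head arrives over sector 6 of an 8-sector track; the request is for sector 2.
print(free_readahead(8, 6, 2))        # [6, 7, 0, 1]

# Worst rotational delay: the target has just passed under the head.
print(len(free_readahead(8, 3, 2)))   # 7
```

In the worst case for rotational delay, the buffered sectors together with the requested one cover the whole track, which corresponds to the full-track case mentioned above.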
Such firmware-level prefetching has many limitations. First, since this readahead is usually carried out on each individual track, it cannot take into consideration the relatively long-term temporal and spatial locality of blocks across the
Blocks continuously located in a file may
not be continuous on disk - Though file