Otherwise, if the length of the queue equals
k, the prefetching buffer has reached its
maximum size: discard the oldest page at the
head of the queue and append the newly
accepted page to the queue tail;
The memory node will record and analyze
both the file ID and the offset in the historical trace.
There is another problem with the access trace.
When multiple applications access the file
system in parallel, the access pattern of one
application can be interleaved with those of other
applications. For example, an application A may read
blocks A_1, A_2, A_3, while an application B reads
B_1, B_2, B_3. Their combined access trace may then
appear as A_1, A_2, A_3, B_1, B_2, B_3, or as
A_1, B_1, A_2, B_2, A_3, B_3. We can
also observe this phenomenon in the real web server
traces described in the last section. The accesses on
sectors 76120651, 76120707, and 76120735 can
be taken as a pattern, but there are many unrelated
accesses between every two of them. Therefore, the
prefetching algorithm should recognize each of the
patterns in a mixed access sequence, as explained
later in the prefetching algorithm.
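One simple way to see how a repeated pattern can be recovered from a mixed trace is to count how often one access follows another within a short window: genuine successors such as A_1 → A_2 recur in every interleaving, while accidental cross-application neighbors vary from trace to trace. This is only an illustrative sketch, not the prefetching algorithm described later in the chapter:

```python
from collections import Counter

def pair_counts(trace, window=3):
    """Count ordered pairs (x, y) where y follows x within `window`
    positions. Pairs belonging to a real pattern (A_1 -> A_2 -> A_3)
    recur across interleavings, so their counts grow; accidental
    inter-application pairs (e.g. A_1 -> B_2) stay infrequent."""
    counts = Counter()
    for i, x in enumerate(trace):
        for y in trace[i + 1 : i + 1 + window]:
            counts[(x, y)] += 1
    return counts

# Two different interleavings of the same two application streams:
t1 = ["A1", "A2", "A3", "B1", "B2", "B3"]
t2 = ["A1", "B1", "A2", "B2", "A3", "B3"]
counts = pair_counts(t1) + pair_counts(t2)

# True successors accumulate evidence from both traces:
assert counts[("A1", "A2")] > counts[("A1", "B2")]
assert counts[("A2", "A3")] > counts[("A2", "B3")]
```

With more traces the separation widens, which is why pattern mining over many recorded trace items can untangle interleaved applications.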
When a page in the prefetching buffer is actually
accessed, it will be read into the file
system cache and then removed from the
prefetching buffer.
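The buffer-management rules above (FIFO eviction when the queue holds k pages, and removal of a page once it is actually accessed) can be sketched as follows; the class and method names are illustrative, not from the chapter:

```python
from collections import OrderedDict

class PrefetchBuffer:
    """Minimal sketch of the prefetching buffer: a FIFO queue of at
    most k pushed pages, assuming pages are keyed by an ID."""

    def __init__(self, k):
        self.k = k
        self.pages = OrderedDict()  # page_id -> page data, oldest first

    def push(self, page_id, data):
        # If the buffer already holds k pages, discard the oldest page
        # at the head of the queue, then append the new page at the tail.
        if len(self.pages) >= self.k:
            self.pages.popitem(last=False)
        self.pages[page_id] = data

    def access(self, page_id):
        # On an actual access, the page is handed to the file system
        # cache and removed from the prefetching buffer.
        return self.pages.pop(page_id, None)

buf = PrefetchBuffer(k=2)
buf.push("p1", b"...")
buf.push("p2", b"...")
buf.push("p3", b"...")             # buffer full: p1 (oldest) is discarded
assert list(buf.pages) == ["p2", "p3"]
assert buf.access("p2") == b"..."  # accessed page leaves the buffer
assert list(buf.pages) == ["p3"]
```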
The parameter k is a key factor for the prefetching
buffer, and it is related to the free physical
memory of the system. If a user node lacks physical
memory, its k should be set to a smaller value to
minimize memory waste; when more free memory is
available, k should be set larger so that more
prefetched pages can be held. The relationship between
k and free memory will be studied in the simulations.
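The chapter leaves the exact choice of k to the simulations; as a hedged illustration, one plausible heuristic is to devote a small fraction of free memory to the buffer and clamp the result. The fraction, bounds, and function name below are assumptions, not the authors' formula:

```python
def choose_k(free_bytes, page_size=4096, fraction=0.05, k_min=8, k_max=1024):
    """Hypothetical heuristic: size the prefetching buffer as a small
    fraction of free physical memory, in pages, clamped to
    [k_min, k_max]. The text only states that k should grow with free
    memory; the specific constants here are illustrative."""
    k = int(free_bytes * fraction) // page_size
    return max(k_min, min(k, k_max))

# More free memory yields a larger (or equal) k, as the text requires:
assert choose_k(64 * 2**20) >= choose_k(16 * 2**20)
```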
Trace Recording Process
Access Trace
When a memory node is recording the access
trace of a user node, it maintains a sequence of
file IDs and offsets ordered by their access
timestamps. The sequence can be denoted as
S = (o_1, o_2, o_3, ..., o_n), where each o_i,
1 ≤ i ≤ n, is a combination of a file ID and an offset.
In our prefetching policy, the memory node
selects the memory pages to be pushed through
the access patterns of a user node, which can be
analyzed from the historical traces of the user. The
memory node should also record current access
traces for future analysis. Every “get page” opera-
tion from a user node contains the ID of the disk
block corresponding to a desired memory page, it
seems appropriate to record each disk block ID as
historical trace data and analyze request patterns
from it. Unfortunately, in most file systems, a file
corresponds to a certain number of disk blocks;
while their relationship may be changed at any
moment. This means that a block may belong to
different files in different time once the file was
moved or deleted. Therefore, instead of disk block
ID, we consider a file ID and an offset in the file
for trace recording. When a user node gets a page
from a memory node, it also sends the file ID
(supported by the file system, such as the inode
in Unix-like file systems) and offset within the
file.
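The sequence S can be recorded incrementally as accesses arrive; a minimal sketch, in which the function and field names are illustrative rather than taken from the chapter:

```python
import time

def record_access(trace, file_id, offset):
    """Append one trace entry o_i = (file ID, offset) together with its
    access timestamp, so the sequence stays ordered by access time."""
    trace.append((time.time(), (file_id, offset)))

trace = []
record_access(trace, file_id=1742, offset=0)
record_access(trace, file_id=1742, offset=4096)
assert [entry[1] for entry in trace] == [(1742, 0), (1742, 4096)]
assert trace[0][0] <= trace[1][0]  # ordered by access timestamp
```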
Each sequence should be partitioned into
smaller ones when recording. If the difference
between the access times of two neighboring items
o_i and o_{i+1} in a sequence is longer than a
threshold t, we can split the sequence between the
two items. The rationale for splitting is
that if the memory node pushes o_{i+1} when o_i is
accessed, o_{i+1} will stay in the prefetching buffer for
a long time and may be discarded by the
user node before it is actually accessed. In other
words, prefetching o_{i+1} is useless because its
intended access time is too far away. Therefore, we
can partition a sequence into small sequences
through a parameter t and find access patterns in
each small sequence. We will call such a partitioned
small sequence a “trace item”. Indeed, the selec-
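The time-threshold partitioning described above can be sketched as follows, assuming each recorded entry is a (timestamp, (file ID, offset)) pair; the function name is illustrative:

```python
def split_trace(events, t):
    """Partition a recorded access trace into "trace items": whenever
    the gap between the timestamps of neighboring accesses o_i and
    o_{i+1} exceeds the threshold t, start a new sub-sequence.
    `events` is a list of (timestamp, (file_id, offset)) pairs in
    access order."""
    items, current = [], []
    prev_ts = None
    for ts, access in events:
        if prev_ts is not None and ts - prev_ts > t:
            items.append(current)
            current = []
        current.append(access)
        prev_ts = ts
    if current:
        items.append(current)
    return items

events = [(0.0, ("f1", 0)), (0.1, ("f1", 4096)),
          (5.0, ("f2", 0)), (5.1, ("f2", 4096))]
# The 4.9 s gap exceeds t = 1.0, so the trace splits into two items:
assert split_trace(events, t=1.0) == [[("f1", 0), ("f1", 4096)],
                                      [("f2", 0), ("f2", 4096)]]
```

Pattern mining then runs within each trace item, so that only accesses close enough in time to be worth prefetching together are considered part of one pattern.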