Otherwise, if the length of the queue equals
k, the prefetching buffer has reached its
maximum size: discard the oldest page at the
head of the queue and append the newly
accepted page to the queue tail;
The memory node will record and analyze
both the file ID and the offset in the historical trace.
There is another problem with the access trace.
When multiple applications access the file
system in parallel, the access pattern of one
application can be interleaved with those of other
applications. For example, an application A may read
blocks A_1, A_2, A_3, while an application B reads
B_1, B_2, B_3. Their combined access trace may then
appear as A_1, A_2, A_3, B_1, B_2, B_3, or as
A_1, B_1, A_2, B_2, A_3, B_3. We can
also observe this phenomenon in the real web server
traces described in the last section. The accesses on
sectors 76120651, 76120707, and 76120735 can
be taken as a pattern, but there are many unrelated
accesses between every two of them. Therefore, the
prefetching algorithm should recognize each of the
patterns in a mixed access sequence, as explained
later in the prefetching algorithm.
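One simple way to see how a repeated pattern can be recovered from a mixed trace is to count how often one access follows another within a short window: genuine successors such as A_1 → A_2 recur in every interleaving, while accidental cross-application neighbors vary from trace to trace. This is only an illustrative sketch, not the prefetching algorithm described later in the chapter:

```python
from collections import Counter

def pair_counts(trace, window=3):
    """Count ordered pairs (x, y) where y follows x within `window`
    positions. Pairs belonging to a real pattern (A_1 -> A_2 -> A_3)
    recur across interleavings, so their counts grow; accidental
    inter-application pairs (e.g. A_1 -> B_2) stay infrequent."""
    counts = Counter()
    for i, x in enumerate(trace):
        for y in trace[i + 1 : i + 1 + window]:
            counts[(x, y)] += 1
    return counts

# Two different interleavings of the same two application streams:
t1 = ["A1", "A2", "A3", "B1", "B2", "B3"]
t2 = ["A1", "B1", "A2", "B2", "A3", "B3"]
counts = pair_counts(t1) + pair_counts(t2)

# True successors accumulate evidence from both traces:
assert counts[("A1", "A2")] > counts[("A1", "B2")]
assert counts[("A2", "A3")] > counts[("A2", "B3")]
```

With more traces the separation widens, which is why pattern mining over many recorded trace items can untangle interleaved applications.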
When a page in the prefetching buffer is actually
accessed, it will be read into the file
system cache and then removed from the
prefetching buffer.
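The buffer-management rules above (FIFO eviction when the queue holds k pages, and removal of a page once it is actually accessed) can be sketched as follows; the class and method names are illustrative, not from the chapter:

```python
from collections import OrderedDict

class PrefetchBuffer:
    """Minimal sketch of the prefetching buffer: a FIFO queue of at
    most k pushed pages, assuming pages are keyed by an ID."""

    def __init__(self, k):
        self.k = k
        self.pages = OrderedDict()  # page_id -> page data, oldest first

    def push(self, page_id, data):
        # If the buffer already holds k pages, discard the oldest page
        # at the head of the queue, then append the new page at the tail.
        if len(self.pages) >= self.k:
            self.pages.popitem(last=False)
        self.pages[page_id] = data

    def access(self, page_id):
        # On an actual access, the page is handed to the file system
        # cache and removed from the prefetching buffer.
        return self.pages.pop(page_id, None)

buf = PrefetchBuffer(k=2)
buf.push("p1", b"...")
buf.push("p2", b"...")
buf.push("p3", b"...")             # buffer full: p1 (oldest) is discarded
assert list(buf.pages) == ["p2", "p3"]
assert buf.access("p2") == b"..."  # accessed page leaves the buffer
assert list(buf.pages) == ["p3"]
```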
The parameter k is a key factor for the prefetching
buffer, and it is related to the free physical
memory of the system. If a user node lacks physical
memory, its k should be set to a smaller value to
minimize memory waste; when more free memory is
available, k should be set larger so that more
prefetched pages can be held. The relationship between
k and free memory will be studied in the simulations.
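The chapter leaves the exact choice of k to the simulations; as a hedged illustration, one plausible heuristic is to devote a small fraction of free memory to the buffer and clamp the result. The fraction, bounds, and function name below are assumptions, not the authors' formula:

```python
def choose_k(free_bytes, page_size=4096, fraction=0.05, k_min=8, k_max=1024):
    """Hypothetical heuristic: size the prefetching buffer as a small
    fraction of free physical memory, in pages, clamped to
    [k_min, k_max]. The text only states that k should grow with free
    memory; the specific constants here are illustrative."""
    k = int(free_bytes * fraction) // page_size
    return max(k_min, min(k, k_max))

# More free memory yields a larger (or equal) k, as the text requires:
assert choose_k(64 * 2**20) >= choose_k(16 * 2**20)
```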
Trace Recording Process
Access Trace
When a memory node is recording the access
trace of a user node, it maintains a sequence of
file IDs and offsets ordered by their access
timestamps. The sequence can be denoted as
S = (o_1, o_2, o_3, ..., o_n), where each o_i,
1 ≤ i ≤ n, is a combination of a file ID and an offset.
In our prefetching policy, the memory node
selects the memory pages to be pushed through
the access patterns of a user node, which can be
analyzed from the historical traces of the user. The
memory node should also record current access
traces for future analysis. Every “get page” opera-
tion from a user node contains the ID of the disk
block corresponding to a desired memory page, it
seems appropriate to record each disk block ID as
historical trace data and analyze request patterns
from it. Unfortunately, in most file systems, a file
corresponds to a certain number of disk blocks;
while their relationship may be changed at any
moment. This means that a block may belong to
different files in different time once the file was
moved or deleted. Therefore, instead of disk block
ID, we consider a file ID and an offset in the file
for trace recording. When a user node gets a page
from a memory node, it also sends the file ID
(supported by the file system, such as the inode
in Unix-like file systems) and offset within the
file.
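The sequence S can be recorded incrementally as accesses arrive; a minimal sketch, in which the function and field names are illustrative rather than taken from the chapter:

```python
import time

def record_access(trace, file_id, offset):
    """Append one trace entry o_i = (file ID, offset) together with its
    access timestamp, so the sequence stays ordered by access time."""
    trace.append((time.time(), (file_id, offset)))

trace = []
record_access(trace, file_id=1742, offset=0)
record_access(trace, file_id=1742, offset=4096)
assert [entry[1] for entry in trace] == [(1742, 0), (1742, 4096)]
assert trace[0][0] <= trace[1][0]  # ordered by access timestamp
```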
Each sequence should be partitioned into
smaller ones when recording. If the difference
between the access times of two neighboring items
o_i and o_{i+1} in a sequence is longer than a
threshold t, we can split the sequence between the
two items. The rationale for splitting is
that if the memory node pushes o_{i+1} when o_i is
accessed, o_{i+1} will stay in the prefetching buffer for
a long time and may be discarded by the
user node before it is actually accessed. In other
words, prefetching o_{i+1} is useless because its
intended access time is too far away. Therefore, we
can partition a sequence into small sequences
through a parameter t and find access patterns in
each small sequence. We will call such a partitioned
small sequence a “trace item”. Indeed, the selec-
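The time-threshold partitioning described above can be sketched as follows, assuming each recorded entry is a (timestamp, (file ID, offset)) pair; the function name is illustrative:

```python
def split_trace(events, t):
    """Partition a recorded access trace into "trace items": whenever
    the gap between the timestamps of neighboring accesses o_i and
    o_{i+1} exceeds the threshold t, start a new sub-sequence.
    `events` is a list of (timestamp, (file_id, offset)) pairs in
    access order."""
    items, current = [], []
    prev_ts = None
    for ts, access in events:
        if prev_ts is not None and ts - prev_ts > t:
            items.append(current)
            current = []
        current.append(access)
        prev_ts = ts
    if current:
        items.append(current)
    return items

events = [(0.0, ("f1", 0)), (0.1, ("f1", 4096)),
          (5.0, ("f2", 0)), (5.1, ("f2", 4096))]
# The 4.9 s gap exceeds t = 1.0, so the trace splits into two items:
assert split_trace(events, t=1.0) == [[("f1", 0), ("f1", 4096)],
                                      [("f2", 0), ("f2", 4096)]]
```

Pattern mining then runs within each trace item, so that only accesses close enough in time to be worth prefetching together are considered part of one pattern.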