Application of both Temporal and Spatial Localities in the Management of Kernel Buffer Cache - Advanced Operating Systems and Kernel Applications


reclaimed anonymous pages in the hope that they would be efficiently swapped in, in the same order. However, the data access pattern in SMM foils this effort. The swap-in accesses to the vector arrays that record the positions of elements in the matrix turn into random accesses, while the matrix elements themselves are still accessed sequentially. This explains why DULO can significantly reduce the execution time of the program (by up to 38.6%): DULO detects the random pages in the vector arrays and caches them with a higher priority. Because the matrix is sparse, the vector arrays are not reused frequently enough for the original kernel to keep them from being paged out. In addition, the similar execution times of the two kernels when there is enough memory (exceeding 424MB) to hold the working set, as shown in the figure, suggest that DULO's overhead is small.
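
The idea of caching random pages with a higher priority can be illustrated with a small sketch. This is not DULO's actual algorithm; it is a simplified stand-in under stated assumptions: cached pages are identified by their disk block numbers, a "sequence" is a run of consecutive block numbers among the cached pages, and the hypothetical helper `choose_victim` looks only at a small window of the least recently used pages and evicts the one that belongs to the longest run, since it would be the cheapest to read back.

```python
def sequence_lengths(blocks):
    """Map each cached disk block to the length of the contiguous run it is in."""
    length = {}
    run = []
    for b in sorted(blocks):
        if run and b != run[-1] + 1:
            # The current run ended: record its length for every member.
            for x in run:
                length[x] = len(run)
            run = []
        run.append(b)
    for x in run:
        length[x] = len(run)
    return length


def choose_victim(lru_order, window=4):
    """Pick an eviction victim (hypothetical policy, for illustration only).

    lru_order: disk block numbers, least recently used first.
    Among the `window` oldest blocks, prefer the one in the longest
    on-disk run; ties go to the least recently used block.
    """
    length = sequence_lengths(lru_order)
    candidates = lru_order[:window]
    return max(candidates, key=lambda b: length[b])


# Block 907 is the oldest, but it is isolated on disk (a random page),
# so the sketch evicts block 100, which sits in a run of three.
victim = choose_victim([907, 100, 101, 102, 55])
```

Under plain LRU the isolated block 907 would be evicted first and would later cost a full seek to bring back; the sketch keeps it and gives up a block that can be re-read as part of a sequential run.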

There have been many other techniques that control data placement on disk (Arpaci-Dusseau et al. 2003; Black et al. 1991) or reorganize selected disk blocks (Hsu et al. 2003), so that related objects are clustered and accesses to them become more sequential. The traxtent-aware file system excludes track-boundary blocks from allocation for better sequential disk access performance (Schindler et al. 2002). Improving access sequentiality by statically arranging the data layout on disk is effective only when the actual accesses take place in the assumed order. If they do not, or if the access order changes from time to time, many random accesses can still occur.
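
A toy example makes the limitation of static layout concrete. The sketch below assumes a hypothetical layout in which six objects are placed on consecutive disk blocks in the order the designer expects them to be read, and it counts an access as a "seek" whenever its block does not directly follow the previously accessed block. The names `count_seeks`, `layout`, `assumed`, and `actual` are illustrative and do not come from any of the cited systems.

```python
def count_seeks(layout, accesses):
    """Count accesses whose disk block does not directly follow the previous one.

    layout:   dict mapping object name -> disk block number
    accesses: list of object names in the order they are read
    """
    seeks = 0
    prev = None
    for name in accesses:
        block = layout[name]
        if prev is not None and block != prev + 1:
            seeks += 1
        prev = block
    return seeks


# Objects a..f are laid out on blocks 0..5 in the assumed access order.
layout = {name: i for i, name in enumerate("abcdef")}
assumed = list("abcdef")   # accesses follow the layout
actual = list("adbecf")    # same objects, different order

seeks_assumed = count_seeks(layout, assumed)
seeks_actual = count_seeks(layout, actual)
```

With the assumed order every access is sequential, while the reordered trace pays a seek on every access after the first, even though the layout itself has not changed.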

Because techniques that focus on the disk alone cannot fully solve the issue, a complementary effort, represented by DULO, is to expose the data layout information to upper-level software, such as the buffer cache management module in the OS kernel, so that it can leverage the information in its policies for higher I/O throughput. Besides DULO, DiskSeen is another example of such an effort (Ding et al. 2007). DiskSeen improves the effectiveness of prefetching by using knowledge of the disk layout to find on-disk data access sequences. Added to conventional file-level prefetching, disk-level prefetching provides substantially higher I/O performance for many access patterns, especially accesses to a large number of small files. The two efforts are complementary and synergistic.
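
To see why looking at disk block numbers helps with many small files, consider the following minimal sketch. It is not DiskSeen's algorithm; it only assumes that the prefetcher can observe the history of accessed disk block numbers regardless of which file each block belongs to. The function name `prefetch_candidates` and the parameters `depth` and `min_run` are hypothetical.

```python
def prefetch_candidates(history, depth=4, min_run=3):
    """Suggest disk blocks to prefetch from a history of accessed block numbers.

    If the last `min_run` accessed blocks are consecutive on disk, return the
    next `depth` block numbers; otherwise return an empty list.
    """
    if len(history) < min_run:
        return []
    tail = history[-min_run:]
    if all(b - a == 1 for a, b in zip(tail, tail[1:])):
        nxt = tail[-1] + 1
        return list(range(nxt, nxt + depth))
    return []


# Three one-block files that happen to be adjacent on disk: a file-level
# prefetcher sees three unrelated short reads, but the block numbers
# form a sequence that can be extended.
sequential = prefetch_candidates([500, 501, 502])
scattered = prefetch_candidates([500, 812, 13])
```

A file-level prefetcher has little to work with when each file is only a block or two long; at the disk level the same accesses form a run that can be read ahead.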

While statically improving the data layout on the disk provides the opportunity for long sequences of data accesses, leveraging the layout information in upper-level software can maximize the performance potential of sequential access and minimize the performance penalty incurred by accessing random data.

We believe that exposing more detailed information on the storage system, such as the configuration of the disk array, the data layout on a disk, and the buffer cache size on the storage controller, to the various software layers of the I/O stack,

Research on Improving and Exposing On-Disk Layout for Upper-Level Softwares

We know that disk head seek time far dominates I/O data transfer time, and that the efficiency of accessing sequential data on the disk can be an order of magnitude higher than that of accessing random data. As the hard disk has been, and is expected to remain, the mainstream on-line storage device for the foreseeable future, efforts to ensure that on-disk data are accessed sequentially are critical to maintaining high I/O performance. Exposing information from the lower layers for better utilization of the hard disk is an active research topic.
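
The gap between sequential and random access can be estimated with a simple service-time model. The figures used below (10 ms of positioning time per random request, a 50 MB/s media transfer rate, 64 KB requests) are illustrative assumptions, not measurements from the chapter, and the helper `throughput_mb_s` is hypothetical.

```python
def throughput_mb_s(request_kb, positioning_ms, transfer_mb_s):
    """Effective throughput when every request first pays `positioning_ms`
    (seek plus rotational delay) and then transfers `request_kb` kilobytes."""
    request_mb = request_kb / 1024
    transfer_ms = request_mb / transfer_mb_s * 1000
    return request_mb / ((positioning_ms + transfer_ms) / 1000)


# Assumed parameters, for illustration only.
random_rate = throughput_mb_s(64, positioning_ms=10, transfer_mb_s=50)
sequential_rate = throughput_mb_s(64, positioning_ms=0, transfer_mb_s=50)
ratio = sequential_rate / random_rate
```

With these assumed numbers each random 64 KB request spends 10 ms positioning and 1.25 ms transferring, so sequential access comes out about 9 times faster; smaller requests widen the gap further because the positioning cost is amortized over less data.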

Most existing work focuses on using disk-specific knowledge to improve data placement on disk in ways that facilitate efficient servicing of future requests. For example, the Fast File System (FFS) and its variants allocate related data and metadata in the same cylinder group to minimize seeks (McKusick et al. 1994; Ganger et al. 1997).
