Exploiting Disk Layout and Block Access History for I/O Prefetch - Advanced Operating Systems and Kernel Applications - page 215

Table 2. Request size (in blocks) of workloads on the stock Linux kernel and on the kernel with DiskSeen

Workload name    Linux 2.6.11    DiskSeen - 1st Run    DiskSeen - 2nd Run
strided          2               2                     59.54
reversed         1               1                     29.79
CVS              2.73            5.98                  8.42
diff             2.88            3.94                  5.4
grep             2.86            82.26                 97.41
TPC-H (Q4)       6.49            6.89                  10.37

activated. In contrast, with DiskSeen, sequential disk accesses across file boundaries can be detected at the disk level by sequence-based prefetching during the first runs. As a result, CVS and diff reduce their execution times by 16% and 18%, respectively. Their second runs reduce the times by a further 19% and 43%, because accesses to non-contiguous data blocks are identified and prefetched as well.

TPC-H

In the TPC-H workload, Query 4 performs a merge-join against table orders and table lineitem. It sequentially searches table orders for orders placed in a specific time frame. For each record, the query searches for the matching records in table lineitem by referring to an index file.

During the first run, DiskSeen can identify sequences for accesses to table lineitem, which is created by appending records in order of time. In the second run, history-aware prefetching can further exploit history trails for disk accesses to the index file, and DiskSeen achieves a 26% reduction in execution time compared to the stock Linux kernel.
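To illustrate why detection at the disk level can see sequences that per-file readahead misses, here is a minimal sketch. It is not DiskSeen's actual algorithm: the function name `detect_sequences`, the `min_run` threshold, and the block numbers are all assumptions chosen for illustration. The sketch scans a stream of logical block numbers and reports runs of consecutive blocks, regardless of which file each block belongs to.

```python
def detect_sequences(blocks, min_run=4):
    """Return (start_block, length) for each run of consecutive block numbers.

    `min_run` is a hypothetical threshold: shorter runs are ignored.
    """
    runs = []
    if not blocks:
        return runs
    start = prev = blocks[0]
    for b in blocks[1:]:
        if b == prev + 1:
            # Still inside the current run of adjacent blocks.
            prev = b
            continue
        # The run ended; record it if it is long enough.
        length = prev - start + 1
        if length >= min_run:
            runs.append((start, length))
        start = prev = b
    # Record the final run, if it qualifies.
    length = prev - start + 1
    if length >= min_run:
        runs.append((start, length))
    return runs


# Two small files laid out back to back on disk (blocks 100-102 and 103-105),
# followed by an unrelated access far away. File identity plays no role here.
trace = [100, 101, 102, 103, 104, 105, 9000]
print(detect_sequences(trace))
```

Because the detector looks only at block addresses, the two adjacent three-block files appear as one six-block run, which is the kind of cross-file sequence the text attributes to disk-level detection.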

Grep

Unlike CVS and diff, which gain significant performance improvements because they alternate between two remote disk regions, grep searches only a local directory. It nevertheless exhibits a substantial improvement: a 20% reduction in its execution time.

In EXT2/EXT3, the disk is segmented into multiple 128MB cylinder groups. In each cylinder group, inode blocks are grouped at the beginning and are followed by file data blocks. Before a file is accessed, its inode must be inspected, so accessing a large number of small files causes the disk head to move back and forth between regions containing file content blocks and regions containing metadata blocks. Because the inode blocks in a cylinder group are laid out contiguously at its beginning, sequence-based prefetching in DiskSeen can effectively prefetch these inode blocks into memory. Most of the disk head movements are thereby removed, which explains the 20% performance improvement.
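The effect of prefetching the inode blocks can be sketched with a toy model of head travel. Everything in this sketch is an assumption made for illustration: the function name `total_head_travel`, the layout (one inode block per file at block i, one data block per file at `data_start + i`), and the use of block distance as a stand-in for seek cost. It is not a model taken from the chapter.

```python
def total_head_travel(num_files, data_start=1000, prefetch_inodes=False):
    """Sum of absolute head movements (in blocks) when reading `num_files` small files.

    Toy layout (an assumption): the inode block of file i sits at block i,
    and the data block of file i sits at block data_start + i.
    """
    head = 0
    travel = 0
    if prefetch_inodes:
        # One sequential sweep over the contiguous inode blocks.
        travel += abs((num_files - 1) - head)
        head = num_files - 1
    for i in range(num_files):
        if not prefetch_inodes:
            # The inode must be read from disk before the file's data.
            travel += abs(i - head)
            head = i
        data_block = data_start + i
        travel += abs(data_block - head)
        head = data_block
    return travel


without_prefetch = total_head_travel(100)
with_prefetch = total_head_travel(100, prefetch_inodes=True)
print(without_prefetch, with_prefetch)
```

In this model the head crosses the gap between the inode region and the data region twice per file without prefetching, but only once in total when the inode blocks are read in a single sweep up front.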

Disk Request Size

Hard disk performance is directly affected by the size of the requests the disk receives. Generally speaking, the larger the disk requests, the more efficiently the disk operates. We therefore also examined the size of disk requests for each workload. To obtain the request sizes, we modified the Linux kernel to monitor the READ/WRITE commands issued to the disk driver and to trace the sizes of the disk requests. Table 2 shows the average size of all requests during the execution of each benchmark.
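The averaging step behind a table of this kind can be sketched as follows. The trace format, a list of `(operation, size_in_blocks)` pairs, and the function name `average_request_size` are hypothetical; the chapter does not describe how its kernel trace is stored, and the sample entries are made up rather than measured.

```python
def average_request_size(trace):
    """Mean request size in blocks over all READ/WRITE entries in `trace`."""
    sizes = [size for op, size in trace if op in ("READ", "WRITE")]
    if not sizes:
        return 0.0
    return sum(sizes) / len(sizes)


# Made-up trace entries, not measurements from the chapter.
trace = [("READ", 2), ("READ", 8), ("WRITE", 4), ("READ", 2)]
print(average_request_size(trace))  # 4.0
```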

As shown in the table, DiskSeen significantly increases the average request size in most cases, which explains the respective execution time reductions shown in Table 1. For example, the first
