Table 2. Request size (blocks) of workloads on the stock Linux kernel and the kernel with DiskSeen

Workload name   Linux 2.6.11   DiskSeen - 1st Run   DiskSeen - 2nd Run
strided         2              2                    59.54
reversed        1              1                    29.79
CVS             2.73           5.98                 8.42
diff            2.88           3.94                 5.4
grep            2.86           82.26                97.41
TPC-H (Q4)      6.49           6.89                 10.37
activated. In contrast, with DiskSeen, sequential
disk accesses across file boundaries can be detected
at the disk level by the sequence-based prefetching
during the first runs. As a result, CVS and diff
reduce their execution times by 16% and
18%, respectively. Their second runs
further reduce the times by another 19% and 43%,
because accesses to non-contiguous data blocks
are identified and prefetched as well.
TPC-H
In TPC-H workloads, Query 4 performs a merge-join
against table orders and table lineitem. It sequentially
searches table orders for orders placed
in a specific time frame. For each record, the query
searches for the matching records in table lineitem
by referring to an index file.
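The access pattern described above can be sketched roughly as follows. This is an illustrative model only, not the database's actual query plan: the function name `query4_access_pattern`, the in-memory `orders` list, and the `lineitem_index` dictionary are all hypothetical stand-ins for the on-disk tables and index file.

```python
def query4_access_pattern(orders, lineitem_index, start, end):
    """Return the sequence of (source, position) reads for a Q4-like lookup.

    orders: list of (order_date, order_key) rows in table order (hypothetical).
    lineitem_index: dict mapping order_key -> list of lineitem row positions,
                    standing in for the index file (hypothetical).
    """
    reads = []
    for pos, (order_date, order_key) in enumerate(orders):
        reads.append(("orders", pos))            # sequential scan of orders
        if not (start <= order_date <= end):
            continue                             # outside the time frame
        reads.append(("index", order_key))       # consult the index file
        for row in lineitem_index.get(order_key, ()):
            reads.append(("lineitem", row))      # fetch matching records
    return reads


# Tiny made-up data set, for illustration only.
orders = [("1993-07-01", 10), ("1993-08-15", 11), ("1994-01-02", 12)]
lineitem_index = {10: [0, 1], 11: [2], 12: [3, 4]}
trace = query4_access_pattern(orders, lineitem_index, "1993-07-01", "1993-10-01")
lineitem_rows = [p for src, p in trace if src == "lineitem"]
```

In this toy data the lineitem rows are visited in ascending position, while the index reads are interleaved between them.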
During the first run, DiskSeen can identify
sequences for accesses to table lineitem, which
is created by appending records in order of order time.
In the second run, history-aware prefetching can
further exploit history trails for disk accesses to
the index file, and DiskSeen achieves a 26% reduction
of execution time compared to the stock
Linux kernel.
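As a rough illustration of what detecting a sequence at the disk level can mean, the sketch below watches logical block numbers and proposes a prefetch once several consecutive blocks have been read. It is a generic, simplified detector and not DiskSeen's actual algorithm; the class name and the `min_run` and `window` parameters are assumptions made for this example.

```python
class SequenceDetector:
    """Toy detector for runs of consecutive block accesses (illustrative only)."""

    def __init__(self, min_run=4, window=8):
        self.min_run = min_run    # consecutive blocks needed before prefetching
        self.window = window      # number of blocks to prefetch ahead
        self.last_block = None
        self.run = 0

    def access(self, block):
        """Record one block access and return the blocks to prefetch, if any."""
        if self.last_block is not None and block == self.last_block + 1:
            self.run += 1         # the run of consecutive blocks continues
        else:
            self.run = 1          # a jump starts a new run
        self.last_block = block
        if self.run >= self.min_run:
            return list(range(block + 1, block + 1 + self.window))
        return []


detector = SequenceDetector(min_run=3, window=2)
suggestions = [detector.access(b) for b in (100, 101, 102, 500)]
```

Because the detector sees only block numbers, it does not matter whether consecutive blocks belong to the same file. A real prefetcher would also avoid re-requesting blocks it has already fetched, which this sketch omits.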
Grep
Unlike CVS and diff, whose significant
performance improvements stem from alternating
accesses to two remote disk regions, grep
searches only a local directory, yet it also exhibits a
substantial performance improvement: a 20% reduction
in its execution time.
In EXT2/EXT3, the disk is segmented into multiple
128MB cylinder groups. In each cylinder
group, inode blocks are grouped at the beginning
and followed by file data blocks. Before a file is
accessed, its inode must be inspected first, so accessing
a large number of small files causes the
disk head to move back and forth between regions
containing file content blocks and regions
containing metadata blocks. Since the inode blocks
in one cylinder group are laid out contiguously
at the beginning, sequence-based prefetching in
DiskSeen can effectively prefetch these inode
blocks into memory. Most of the disk head
movements are thus removed, which explains the 20%
performance improvement.
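The effect on head movement can be illustrated with a small back-of-the-envelope model. The layout below is made up for the example (inode blocks at the start of a group, file data 1000 blocks further in), and `head_travel` is a hypothetical helper that simply sums the distances between successive requests; real seek cost is not linear in distance.

```python
def head_travel(requests, start=0):
    """Total distance, in blocks, the head moves to serve requests in order."""
    travel, pos = 0, start
    for block in requests:
        travel += abs(block - pos)
        pos = block
    return travel


# Hypothetical cylinder group: 4 inode blocks at the start, data further in.
n_files = 4
inode_blocks = list(range(n_files))                     # blocks 0..3
data_blocks = [1000 + 10 * i for i in range(n_files)]   # blocks 1000, 1010, ...

# Without prefetching: read inode, then data, for each small file in turn.
interleaved = [b for pair in zip(inode_blocks, data_blocks) for b in pair]

# With the inode blocks prefetched in one sequential pass before the data.
prefetched = inode_blocks + data_blocks

print(head_travel(interleaved), head_travel(prefetched))
```

In this toy layout the interleaved order travels 7078 blocks, while reading the inodes in one pass first travels 1030 blocks.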
Disk Request Size
Hard disk performance is directly affected by
the size of the requests the disk receives. Generally
speaking, the larger the disk requests are, the more
efficiently the disk operates. Thus, we also examined
the size of disk requests for each workload.
To obtain the request sizes, we modified the Linux
kernel to monitor READ/WRITE commands issued
to the disk driver and trace the sizes of disk
requests. Table 2 shows the average size of all the
requests during the executions of the benchmarks.
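Computing the reported metric from such a trace is straightforward. The sketch below assumes a hypothetical trace format of (command, number of blocks) pairs; the actual kernel instrumentation and its output format are not described in the text.

```python
def average_request_size(trace):
    """Mean number of blocks per READ/WRITE request in a trace.

    trace: iterable of (command, num_blocks) pairs (assumed format).
    """
    total_blocks = 0
    count = 0
    for command, num_blocks in trace:
        if command not in ("READ", "WRITE"):
            continue              # ignore anything that is not a data transfer
        total_blocks += num_blocks
        count += 1
    return total_blocks / count if count else 0.0


# Made-up trace for illustration.
sample_trace = [("READ", 2), ("READ", 2), ("WRITE", 8), ("READ", 4)]
avg = average_request_size(sample_trace)
```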
As shown in the table, DiskSeen significantly
increases the average request sizes in most cases,
which explains the respective execution time reductions
shown in Table 1. For example, the first
 