data, which exhibits the highest ARRs, the average observed ARR was still
2.86%, 3.3 times higher than 0.88%.
With this cluster and disk failure data, various projections can be developed.
First, integrators are projected to deliver petascale computers according
to the long-standing trends shown on top500.org [9]; that is, the aggregate
compute performance will double every year. Second, integrators will continue to
build balanced systems; that is, storage size and bandwidth will scale linearly
with memory size and total compute power [30]. As a baseline, the projections model
the Jaguar system at Oak Ridge National Laboratory after it is expanded to a
petaFLOP system having approximately 11,000 processor sockets (dual-core
Opterons), 45 TB of main memory, and a storage bandwidth of 55 GB/s [35].
While the architecture of other 2008 petascale machines, such as LANL's
Roadrunner [36], differs from Jaguar's in its use of hybrid nodes employing
vector/graphics coprocessors, predictions for their failure rates differ little
from those for Jaguar, so they are not included in the following graphs.
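The two scaling assumptions above (yearly doubling of compute, with memory and storage bandwidth scaling linearly alongside it) can be sketched as a small projection. This is only an illustration: the `System` type and `project_balanced` function are hypothetical names, and treating 2008 as the baseline year for the expanded Jaguar system is an assumption of the sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class System:
    year: int
    petaflops: float
    memory_tb: float
    storage_gb_per_s: float


# Baseline figures quoted in the text for the expanded Jaguar system.
# The baseline year (2008) is an assumption made for this sketch.
JAGUAR = System(year=2008, petaflops=1.0, memory_tb=45.0, storage_gb_per_s=55.0)


def project_balanced(base: System, year: int, annual_growth: float = 2.0) -> System:
    """Scale a balanced system forward: compute grows by `annual_growth`
    per year, and memory and storage bandwidth scale linearly with it."""
    factor = annual_growth ** (year - base.year)
    return System(
        year=year,
        petaflops=base.petaflops * factor,
        memory_tb=base.memory_tb * factor,
        storage_gb_per_s=base.storage_gb_per_s * factor,
    )


if __name__ == "__main__":
    for y in range(2008, 2020, 2):
        print(project_balanced(JAGUAR, y))
```

Under these assumptions, ten years of doubling multiplies every quantity by 1,024.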
Further, individual disk bandwidth will grow at a rate of about 20% per
year, which is significantly slower than the 100% per year growth rate that
top500.org predicts. In order to keep up, the number of disk drives in a system
will have to increase at an impressive rate. Figure 2.11a projects the number
of drives in a system necessary simply to maintain balance. The figure shows
that, if current technology trends continue, by 2018 a computing system at
the top of the top500.org chart will need to have more than 800,000 disk drives.
Managing this number of independent disk drives, much less delivering all of
their bandwidth to an application, will be extremely challenging for storage
system designers.
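The growth in drive count follows from the two rates quoted above: if required storage bandwidth doubles each year while per-drive bandwidth grows by about 20%, the number of drives must grow by a factor of 2.0 / 1.2 per year. A minimal sketch of that arithmetic follows; the function names are hypothetical, and the baseline drive count is left as a parameter because the text does not state one.

```python
def drive_count_growth(
    years: int,
    bandwidth_demand_growth: float = 2.0,    # system bandwidth doubles yearly
    per_drive_bandwidth_growth: float = 1.2,  # about 20% per year per drive
) -> float:
    """Factor by which the drive count must grow over `years` to stay balanced."""
    return (bandwidth_demand_growth / per_drive_bandwidth_growth) ** years


def projected_drives(base_drives: float, years: int) -> float:
    """Drive count after `years`, given a caller-supplied baseline count."""
    return base_drives * drive_count_growth(years)


if __name__ == "__main__":
    for n in (1, 5, 10):
        print(f"after {n:2d} years: x{drive_count_growth(n):.1f}")
```

Over ten years the factor is roughly 165, which is why the drive count rises so steeply even from a modest baseline.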
In addition, disk drive capacity will keep growing by about 50% per year,
thereby continuously increasing the amount of work needed to reconstruct
a failed drive and the time needed to complete this reconstruction. While
other trends, such as a decrease in the physical size (diameter) of drives, will help
to limit the increase in reconstruction time, these are single-step decreases
limited by the poorer cost-effectiveness of the smaller disks. Overall it is
[Figure 2.11 (a) Number of disk drives in the largest of future systems.
(b) Number of concurrent reconstructions in the largest of future systems.
Horizontal axis in both panels: Year, 2006 to 2018.]
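The capacity and per-drive bandwidth growth rates quoted in the text (about 50% and about 20% per year) can be combined into a rough scaling estimate for reconstruction time. The sketch below assumes that reconstruction is bounded by transferring a full drive's contents at the drive's own bandwidth. That is a simplification, not a model taken from the source, and the function names are hypothetical.

```python
def reconstruction_time_growth(
    years: int,
    capacity_growth: float = 1.5,   # about 50% per year
    bandwidth_growth: float = 1.2,  # about 20% per year
) -> float:
    """Factor by which full-drive transfer time grows over `years`,
    assuming time is proportional to capacity / bandwidth."""
    return (capacity_growth / bandwidth_growth) ** years


def min_reconstruction_hours(capacity_gb: float, bandwidth_mb_per_s: float) -> float:
    """Lower bound on reconstruction time: one full pass over the drive
    at full drive bandwidth (1 GB taken as 1000 MB)."""
    seconds = capacity_gb * 1000.0 / bandwidth_mb_per_s
    return seconds / 3600.0


if __name__ == "__main__":
    for n in (1, 5, 10):
        print(f"after {n:2d} years: x{reconstruction_time_growth(n):.2f}")
```

Under this simplification the bound grows by 25% per year, or roughly ninefold over a decade, before any one-time gains from smaller drives are taken into account.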