Information Technology Reference
In-Depth Information
11.1.1 Playing the Red-Blue Pebble Game
The rules for the red-blue pebble game are illustrated by the eight-input FFT graph shown in
Fig. 11.1 .If S = 3 red pebbles are available to pebble this graph (at least S = 4 pebbles are
needed in the one-pebble game), a pebbling strategy that keeps the number of I/O operations
small is based on the pebbling of sub-FFT graphs on two inputs. Three such sub-FFT sub-
graphs are shown by heavy lines in Fig. 11.1 , one at each level of the FFT graph. This pebbling
strategy uses three red pebbles to place blue pebbles on the outputs of each of the four lowest-
level sub-FFT graphs on two inputs, those whose outputs are second-level vertices of the full
FFT graph. (Thus, eight blue pebbles are used.) Shown on a second-level sub-FFT graph are
three red pebbles at the time when a pebble has just been placed on the first of the two outputs
of this sub-FFT graph. This strategy performs two I/O operations for each vertex except for
input and output vertices. A small savings is possible if, after pebbling the last sub-FFT graph
at one level, we immediately pebble the last sub-FFT graph at the next level.
11.1.2 Balanced Computer Systems
A balanced computer system is one in which no computational unit or data channel becomes
saturated before any other. The results in this chapter can be used to analyze balance. To
illustrate this point, we examine a serial computer system consisting of a CPU with a random-
access memory and a disk storage unit. Such a system is balanced for a particular problem if
the time used for I/O is comparable to the time used for computation.
As shown in Section 11.5.2 , multiplying two n
n matrices with a variant of the classical
matrix multiplication algorithm requires a numb er of computations proportional to n 3 and a
×
number of I/O operations proportional to n 3 / S ,where S is the number of red pebbles or
the capacity of the random-access memory. Let t 0 and t 1 be the times for one co m putation
and I/O operation, respectively. Then the system is balanced when t 0 n 3
t 1 n 3 / S .Letthe
computational and I/O capacities , C comp and C I / O , be the rates at which the CPU and disk
can compute and exchange data, respectively; that is, C comp = 1 /t 0 and C I / O = 1 /t 1 .Thus,
balance is achieved when the following condition holds:
S
C comp
C I / O
From this condition we see that if through technological advance the ratio C comp /C I / O in-
creases by a factor β , then for the system to be balanced the storage capacity of the system, S ,
must increase by a factor β 2 .
Hennessy and Patterson [ 132 , p. 427] observe that CPU speed is increasing between 50%
and 100% per year while that of disks is increasing at a steady 7% per year. Thus, if the ratio
C comp /C I / O for our simple computer system grows by a factor of 50 / 7
7peryear,then
S must grow by about a factor of 49 per year to maintain balance. To the extent that matrix
multiplication is typical of the type of computing to be done and that computers have two-
level memories, a crisis is looming in the computer industry! Fortunately, multi-level memory
hierarchies are being introduced to help avoid this crisis.
As bad as the situation is for matrix multiplication, it is much worse for the Fourier trans-
form and sorting. For each of these problems the number of computation and I/O operations
is proportional to n log 2 n and n log 2 n/ log 2 S , respectively (see Section 11.5.3 ). Thus, bal-
 
Search WWH ::




Custom Search