Information Technology Reference
In-Depth Information
multi-level memory hierarchy exhibits locality of reference if it uses its higher-level memory
units much less often that its lower-level units. Formally, we say that a pebbling strategy
P
is
c -local if T (
P
, G ) satisfies the following inequality:
L
T ( L )
l
cT ( 2 )
1
t l ·
( p , b , G ,
P
)
( s L− 1 , G )
l = 1
The definition of a c -local pebbling strategy is illustrated by the results for matrix multipli-
cation in the standard MHG when block I/O is not used. Let k be the largest integer such that
s k
3 n 2 .FromTheorem 11.5.3 for matrix-matrix multiplication, we see that there exists an
optimal pebbling if
k
L
t l
b l s l− 1 +
t l
nb l
c
(11.1)
l = 2
l = k + 1
for some c > 0since T ( 2 1 ( S , G )=Θ( n 3 ) .
We noted in Section 11.1.2 that the imbalance between the computation and I/O times
for matrix multiplication is becoming ever moreseriouswiththeadvanceoftechnology.We
re-examine this issue in light of the above condition. Consider the case in which k + 1 = L ;
that is, the highest-level memory is used to store the arguments and results of a computation.
In this case the second term on the left-hand side of ( 11.1 ) is a relative measure of the time
to bring data into lower-level memories from the highest-level memory. It is negligible when
nb L is large. For example, if t L = 2,000,000 and b L = 10,000, say, then n must be at least
200, a modest-sized matrix. The first term on the left-hand side reflects the number of t imes
data moves between the levels of the hierarchy holding the data. It is small when b l s l− 1
is large by comparison with t l for 2
l
k , a condition that is not hard to meet. For
example, if s l− 1 = 32
10 6 (about 4 Mbytes) and b l = 1,000, then t l must be less than
about 45, a condition that certainly applies to low level memories such as today's random-
access memories. Problems 11.15 and 11.16 provide opportunities to explore this issue with
the FFT and convolution.
×
11.8 RAM-Based I/O Models
The MHG assumes that computations are done by pebbling the vertices of a directed acyclic
graph. That is, it assumes that computations are straight-line. While the best known algo-
rithms for the problems studied earlier in this chapter are straight-line, some problems are not
efficiently done in a straight-line fashion. For example, binary search in a tree that holds a set
of keys in sorted order (see Section 11.9.1 ) is much better suited to data-dependent compu-
tation of the kind allowed by an unrestricted RAM. Similarly, the merging of two sorted lists
can be done more efficiently on a RAM than with a straight-line program. For this reason
we consider RAM-based I/O models, specifically the block-transfer model and the hierarchical
memory model.
11.8.1 The Block-Transfer Model
The block-transfer model is a two-level I/O model that generalizes the red-blue pebble game
to RAM-based computations by allowing programs that are not straight-line.
Search WWH ::




Custom Search