Information Technology Reference
In-Depth Information
multi-level memory hierarchy exhibits locality of reference if it uses its higher-level memory
units much less often that its lower-level units. Formally, we say that a pebbling strategy
P
is
c
-local
if
T
(
P
,
G
)
satisfies the following inequality:
L
T
(
L
)
l
cT
(
2
)
1
t
l
·
(
p
,
b
,
G
,
P
)
≤
(
s
L−
1
,
G
)
l
=
1
The definition of a
c
-local pebbling strategy is illustrated by the results for matrix multipli-
cation in the standard MHG when block I/O is not used. Let
k
be the largest integer such that
s
k
≤
3
n
2
.FromTheorem
11.5.3
for matrix-matrix multiplication, we see that there exists an
optimal pebbling if
k
L
t
l
b
l
√
s
l−
1
+
t
l
nb
l
≤
c
∗
(11.1)
l
=
2
l
=
k
+
1
for some
c
∗
>
0since
T
(
2
1
(
S
,
G
)=Θ(
n
3
)
.
We noted in Section
11.1.2
that the imbalance between the computation and I/O times
for matrix multiplication is becoming ever moreseriouswiththeadvanceoftechnology.We
re-examine this issue in light of the above condition. Consider the case in which
k
+
1
=
L
;
that is, the highest-level memory is used to store the arguments and results of a computation.
In this case the second term on the left-hand side of (
11.1
) is a relative measure of the time
to bring data into lower-level memories from the highest-level memory. It is negligible when
nb
L
is large. For example, if
t
L
=
2,000,000 and
b
L
=
10,000, say, then
n
must be at least
200, a modest-sized matrix. The first term on the left-hand side reflects the number of t
imes
data moves between the levels of the hierarchy holding the data. It is small when
b
l
√
s
l−
1
is large by comparison with
t
l
for 2
≤
l
≤
k
, a condition that is not hard to meet. For
example, if
s
l−
1
=
32
10
6
(about 4 Mbytes) and
b
l
=
1,000, then
t
l
must be less than
about 45, a condition that certainly applies to low level memories such as today's random-
access memories. Problems
11.15
and
11.16
provide opportunities to explore this issue with
the FFT and convolution.
×
11.8 RAM-Based I/O Models
The MHG assumes that computations are done by pebbling the vertices of a directed acyclic
graph. That is, it assumes that computations are straight-line. While the best known algo-
rithms for the problems studied earlier in this chapter are straight-line, some problems are not
efficiently done in a straight-line fashion. For example, binary search in a tree that holds a set
of keys in sorted order (see Section
11.9.1
) is much better suited to data-dependent compu-
tation of the kind allowed by an unrestricted RAM. Similarly, the merging of two sorted lists
can be done more efficiently on a RAM than with a straight-line program. For this reason
we consider RAM-based I/O models, specifically the block-transfer model and the hierarchical
memory model.
11.8.1 The Block-Transfer Model
The block-transfer model is a two-level I/O model that generalizes the red-blue pebble game
to RAM-based computations by allowing programs that are not straight-line.
Search WWH ::
Custom Search