Source  Degree  Destinations
A       3       B
B       2       A
(a) Representation of M_11, connecting A and B to A and B

Source  Degree  Destinations
C       1       A
D       2       B
(b) Representation of M_12, connecting C and D to A and B

Source  Degree  Destinations
A       3       C, D
B       2       D
(c) Representation of M_21, connecting A and B to C and D

Source  Degree  Destinations
D       2       C
(d) Representation of M_22, connecting C and D to C and D

Figure 5.14: Sparse representation of the blocks of a matrix
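The tables of Figure 5.14 can be written down as a data structure. The following is a minimal sketch (the Python dicts and the names M11 through M22 are illustrative, not from the text): each block maps a source node to the pair (out-degree of the source, destinations of the source that fall within that block's stripe).

```python
# Sketch of the sparse block representation of Figure 5.14.
# Each block maps: source -> (out-degree of source,
#                             destinations of source within this stripe).
M11 = {"A": (3, ["B"]), "B": (2, ["A"])}       # connects A and B to A and B
M12 = {"C": (1, ["A"]), "D": (2, ["B"])}       # connects C and D to A and B
M21 = {"A": (3, ["C", "D"]), "B": (2, ["D"])}  # connects A and B to C and D
M22 = {"D": (2, ["C"])}                        # connects C and D to C and D
```

Note that the out-degree of a source is repeated in every block row that mentions that source (A has degree 3 in both M11 and M21), which is exactly the redundancy visible in the figure.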
The work reading M and v is thus the same as for the algorithm of Section 5.2.3, but the advantage of this approach is that each Map task can combine all the terms for the portion v′_i for which it is exclusively responsible. In other words, the Reduce tasks have nothing to do but to concatenate the pieces of v′ received from the k Map tasks.
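As a sketch of this point, the i-th Map task can be pictured as the following function (the name map_task_stripe and the dict-based representation are assumptions for illustration): it reads one row of blocks together with all of v and emits the completed stripe, so Reduce has only concatenation left to do.

```python
# Hypothetical sketch of the i-th Map task: it reads the blocks
# M_i1 .. M_ik (one row of blocks) and all of v, and produces the
# entire stripe v'_i by itself.
def map_task_stripe(blocks_row_i, v):
    v_prime_i = {}
    for block in blocks_row_i:
        for source, (degree, dests) in block.items():
            for dest in dests:
                # source contributes v[source]/degree to each destination
                v_prime_i[dest] = v_prime_i.get(dest, 0.0) + v[source] / degree
    return v_prime_i

# Stripe 1 of Figure 5.14 is produced from blocks M_11 and M_12:
M11 = {"A": (3, ["B"]), "B": (2, ["A"])}
M12 = {"C": (1, ["A"]), "D": (2, ["B"])}
v = {"A": 0.25, "B": 0.25, "C": 0.25, "D": 0.25}
stripe1 = map_task_stripe([M11, M12], v)  # stripe1["A"] == 0.375
```

With four equal initial values of 1/4, node A receives 1/4 divided by 2 from B (via M_11) plus 1/4 divided by 1 from C (via M_12), giving 3/8; no Reduce-side summation is needed.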
We can extend this idea to an environment in which map-reduce is not used. Suppose we have a single processor, with M and v stored on its disk, using the same sparse representation for M that we have discussed. We can first simulate the first Map task, the one that uses blocks M_11 through M_1k and all of v to compute v′_1. Then we simulate the second Map task, reading M_21 through M_2k and all of v to compute v′_2, and so on. As for the previous algorithms, we thus read M once and v k times. We can make k as small as possible, subject to the constraint that there is enough main memory to store 1/kth of v and 1/kth of v′, along with as small a portion of M as we can read from disk (typically, one disk block).
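The single-processor scheme can be sketched as follows (the function name and in-memory dicts are assumptions; in the real setting each row of blocks and each copy of v would be streamed from disk). Processing one row of blocks at a time means M is scanned once overall while v is scanned once per row, i.e., k times.

```python
# Hypothetical sketch of simulating the k Map tasks on one processor.
# block_rows[i] holds the row of blocks M_(i+1)1 .. M_(i+1)k, each in the
# sparse form {source: (degree, destinations)}.
def simulate_map_tasks(block_rows, v):
    v_prime = {}
    for row in block_rows:                # simulate one Map task
        stripe = {}
        for block in row:                 # scan M_i1 through M_ik
            for source, (degree, dests) in block.items():
                for dest in dests:
                    stripe[dest] = stripe.get(dest, 0.0) + v[source] / degree
        v_prime.update(stripe)            # "concatenate" stripe v'_i onto v'
    return v_prime

# The blocks of Figure 5.14, grouped by row of blocks:
blocks = [
    [{"A": (3, ["B"]), "B": (2, ["A"])},       # M_11
     {"C": (1, ["A"]), "D": (2, ["B"])}],      # M_12
    [{"A": (3, ["C", "D"]), "B": (2, ["D"])},  # M_21
     {"D": (2, ["C"])}],                       # M_22
]
v = {"A": 0.25, "B": 0.25, "C": 0.25, "D": 0.25}
v_prime = simulate_map_tasks(blocks, v)
```

Only the current stripe of v′ and the portion of v being scanned need be resident, which is what permits choosing k just large enough for 1/kth of v and v′ to fit in main memory.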