Database Reference
In-Depth Information
From this inequality, you can derive
r
≥ 2
n
2
/q
We leave the algebraic manipulation, which is similar to that in
Example 2.17
,
as an exer-
cise.
Now, let us consider the generalization of the two-pass matrix-multiplication algorithm
that we described in
Section 2.3.9
.
This algorithm too can be generalized so that the first
MapReduce job uses reducers of size greater than 2. The idea is suggested by
Fig. 2.12
. We
may divide the rows and columns of both input matrices
M
and
N
into
g
groups of
n/g
rows
or columns each. The intersections of the groups partition each matrix into
g
2
squares of
n
2
/g
2
elements each.
Figure 2.12
Partitioning matrices into squares for a two-pass MapReduce algorithm
The square of
M
corresponding to set of rows
I
and set of columns
J
combines with the
square of
N
corresponding to set of rows
J
and set of columns
K
. These two squares com-
pute some of the terms that are needed to produce the square of the output matrix
P
that
has set of rows
I
and set of columns
K
. However, these two squares do not compute the full
value of these elements of
P
; rather they produce a contribution to the sum. Other pairs of
squares, one from
M
and one from
N
, contribute to the same square of
P
. These contribu-
tions are suggested in
Fig. 2.12
. There, we see how all the squares of
M
with a fixed value
for set of rows
I
pair with all the squares of
N
that have a fixed value for the set of columns
K
by letting the set
J
vary.
So in the first pass, we compute the products of the square (
I, J
) of
M
with the square (
J,
K
) of
N
, for all
I
,
J
, and
K
. Then, in the second pass, for each
I
and
K
we sum the products
over all possible sets
J
. In more detail, the first MapReduce job does the following.
The Map Function
The keys are triples of sets of rows and/or column numbers (
I, J, K
).
Suppose the element
m
ij
belongs to group of rows
I
and group of columns
J
. Then from