\sum_{i=1}^{k} q_i \ge p^2/q \qquad (2.2)
The final step, which is step (5), is to divide both sides of Equation 2.2 by p, the number
of inputs. As a result, the left side is equal to the replication rate, and the right side
becomes p/q. That is, we have proved the lower bound on r:

r \ge p/q
As claimed, this shows that the algorithms in the family of Example 2.15 all have a
replication rate that is at most twice the lowest possible replication rate.
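To make the factor of two concrete, the following sketch compares the actual replication rate of the grouping algorithms with the lower bound p/q just proved. It is an illustrative calculation, not code from the text; the function name, and the convention that each input is sent to the g − 1 reducers pairing its group with every other group, are assumptions based on Example 2.15.

```python
# Sketch: replication rate of the grouping algorithms of Example 2.15
# versus the lower bound r >= p/q proved above. The helper name is
# hypothetical; p inputs are split into g equal groups, with one reducer
# for each pair of distinct groups.

def grouping_algorithm_stats(p: int, g: int):
    assert p % g == 0, "assume g divides p evenly"
    r = g - 1            # each input goes to the g-1 reducers pairing its group with another
    q = 2 * p // g       # a reducer receives two groups of p/g inputs each
    lower_bound = p / q  # = g/2, the bound proved above
    return r, q, lower_bound

for g in (10, 100, 1000):
    r, q, lb = grouping_algorithm_stats(p=10**6, g=g)
    # r / lb = 2(g-1)/g, so the actual rate is always below twice the bound
    print(f"g={g}: r={r}, q={q}, lower bound={lb}, ratio={r/lb:.3f}")
```

Since the ratio 2(g − 1)/g approaches 2 from below as g grows, the bound is tight to within a factor of two across the whole family.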
2.6.7 Case Study: Matrix Multiplication
In this section we shall apply the lower-bound technique to one-pass matrix-multiplication
algorithms. We saw one such algorithm in Section 2.3.10, but that is only an extreme case
of a family of possible algorithms. In particular, for that algorithm, a reducer corresponds
to a single element of the output matrix. Just as we grouped inputs in the similarity-join
problem to reduce the communication at the expense of a larger reducer size, we can group
rows and columns of the two input matrices into bands. Each pair consisting of a band of
rows of the first matrix and a band of columns of the second matrix is used by one reducer
to produce a square of elements of the output matrix. An example is suggested by Fig. 2.11.
Figure 2.11 Dividing matrices into bands to reduce communication
In more detail, suppose we want to compute MN = P, and all three matrices are n × n.
Group the rows of M into g bands of n/g rows each, and group the columns of N into g
bands of n/g columns each. This grouping is as suggested by Fig. 2.11. Keys correspond to
two groups (bands), one from M and one from N.
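The banding scheme just described can be simulated in a single process. The sketch below is an illustration under stated assumptions (the function name `banded_matmul` and the use of NumPy are ours, not the text's); it shows that keys are pairs of bands and that each reducer produces one square of P.

```python
import numpy as np
from collections import defaultdict

def banded_matmul(M: np.ndarray, N: np.ndarray, g: int) -> np.ndarray:
    """Compute P = MN by the banding scheme: one reducer per key
    (row band of M, column band of N), each producing a square of P."""
    n = M.shape[0]
    assert M.shape == N.shape == (n, n) and n % g == 0
    b = n // g  # band width

    # "Map": route whole bands to keys (i, j). Sending a band to all g keys
    # that mention it mirrors the g key-value pairs generated per element.
    reducers = defaultdict(dict)
    for i in range(g):
        for j in range(g):
            reducers[(i, j)]["M"] = M[i * b:(i + 1) * b, :]   # row band i of M
            reducers[(i, j)]["N"] = N[:, j * b:(j + 1) * b]   # column band j of N

    # "Reduce": each reducer multiplies its two bands to produce one
    # b-by-b square of the output matrix.
    P = np.empty((n, n))
    for (i, j), bands in reducers.items():
        P[i * b:(i + 1) * b, j * b:(j + 1) * b] = bands["M"] @ bands["N"]
    return P
```

Each reducer holds one band of n²/g elements from each matrix, so the reducer size is q = 2n²/g, and each element is shipped to g reducers, giving replication rate r = g.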
The Map Function For each element of M, the Map function generates g key-value pairs.
The value in each case is the element itself, together with its row and column number so
it can be identified by the Reduce function. The key is the group to which the element
belongs, paired with any of the groups of the matrix N. Similarly, for each element of N, the