Database Reference
In-Depth Information
for a large matrix. However, it is possible that i , j , and k are implicit in the position of a
matrix element in the file that represents it, rather than written explicitly with the element
itself. In that case, the Map function will have to be designed to construct the I , J , and K
components of tuples from the position of the data.
The product MN is almost a natural join followed by grouping and aggregation. That is,
the natural join of M ( I, J, V ) and N ( J, K, W ), having only attribute J in common, would
produce tuples ( i, j, k, v, w ) from each tuple ( i, j, v ) in M and tuple ( j, k, w ) in N . This five-
component tuple represents the pair of matrix elements ( m ij , n jk ). What we want instead
is the product of these elements, that is, the four-component tuple ( i, j, k, v × w ), because
that represents the product m ij n jk . Once we have this relation as the result of one MapRe-
duce operation, we can perform grouping and aggregation, with I and K as the grouping
attributes and the sum of V × W as the aggregation. That is, we can implement matrix mul-
tiplication as the cascade of two MapReduce operations, as follows. First:
x For each matrix element m ij , produce the key value pair ( j, ( M, i, m ij )). Likewise, for
each matrix element n jk , produce the key value pair (( j, ( N, k, n jk )). Note that M and N in
the values are not the matrices themselves. Rather they are names of the matrices or (as we
mentioned for the similar Map function used for natural join) better, a bit indicating wheth-
er the element comes from M or N .
The Reduce Function For each key j , examine its list of associated values. For each value
that comes from M , say ( M, i, m ij )), and each value that comes from N , say ( N, k, n jk )),
produce a key-value pair with key equal to ( i, k ) and value equal to the product of these
elements, m ij n jk .
Now, we perform a grouping and aggregation by another MapReduce operation.
x This function is just the identity. That is, for every input element with key ( i, k ) and
value v , produce exactly this key-value pair.
The Reduce Function For each key ( i, k ), produce the sum of the list of values associated
with this key. The result is a pair (( i, k ) , v ), where v is the value of the element in row i and
column k of the matrix P = MN .
2.3.10
Matrix Multiplication with One MapReduce Step
There often is more than one way to use MapReduce to solve a problem. You may wish to
use only a single MapReduce pass to perform matrix multiplication P = MN . 5 It is possible
to do so if we put more work into the two functions. Start by using the Map function to
create the sets of matrix elements that are needed to compute each element of the answer P .
Notice that an element of M or N contributes to many elements of the result, so one input
Search WWH ::




Custom Search