Fig. 9.3
Decision tree for choosing between various join strategies on the MapReduce framework
Fig. 9.4
An overview of the Map-Reduce-Merge framework
[Diagram labels: coordinator (driver process; fork, communication); mappers; reducers; mergers; DFS file/chunks; splits; internal transfer; remote transfer; DFS input/output; output]
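As a rough illustration of the map and reduce signatures that this model builds on (they are described in the passage that follows), here is a minimal single-process Python sketch. It covers only the map and reduce steps for one lineage, not the merge step. The helper `run_map_reduce`, the functions `map_salary` and `reduce_total`, and the employee records are hypothetical names and data invented for this example; they do not come from the Map-Reduce-Merge work itself.

```python
from collections import defaultdict

def run_map_reduce(records, map_fn, reduce_fn):
    """Toy, in-memory map and reduce over one dataset (one lineage).

    map_fn:    (k1, v1)   -> list of (k2, v2)
    reduce_fn: (k2, [v2]) -> [v3], still associated with k2
    """
    groups = defaultdict(list)
    for k1, v1 in records:
        # Map phase: each input pair yields a list of intermediate pairs.
        for k2, v2 in map_fn(k1, v1):
            groups[k2].append(v2)
    # Reduce phase: aggregate the list [v2] collected for each k2.
    return {k2: reduce_fn(k2, v2_list) for k2, v2_list in groups.items()}

# Hypothetical input: (employee id, (department, salary)) pairs.
def map_salary(k1, v1):
    dept, salary = v1
    return [(dept, salary)]

def reduce_total(k2, values):
    return [sum(values)]

employees = [
    ("e1", ("sales", 100)),
    ("e2", ("sales", 150)),
    ("e3", ("ops", 80)),
]
totals = run_map_reduce(employees, map_salary, reduce_total)
print(totals)  # {'sales': [250], 'ops': [80]}
```

The output keys are the same k2 values that the map step emitted, which mirrors the point that the reduce output remains associated with k2 and stays within the same dataset.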
To overcome the extra processing required to perform join operations in the
MapReduce framework, the Map-Reduce-Merge model [103] has been introduced to
enable the processing of multiple datasets. Figure 9.4 illustrates the framework
of this model, where the map phase transforms an input key/value pair (k1, v1)
into a list of intermediate key/value pairs [(k2, v2)]. The reduce function
aggregates the list of values [v2] associated with k2 and produces a list of
values [v3], which is also associated with k2. Note that the inputs and outputs
of both functions belong to the same lineage (α). Another pair of map and reduce functions produce