Database Reference
Figure 9-2. Inner join of two datasets
If the join is performed by the mapper, it is called a map-side join, whereas if it is
performed by the reducer, it is called a reduce-side join.
If both datasets are too large for either to be copied to each node in the cluster, we can still
join them using MapReduce with a map-side or reduce-side join, depending on how the
data is structured. One common example of this case is a user database and a log of some
user activity (such as access logs). For a popular service, it is not feasible to distribute the
user database (or the logs) to all the MapReduce nodes.
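To make the user-database-and-log example concrete, here is a minimal sketch of a reduce-side inner join, simulated in plain Python rather than on a real Hadoop cluster. The record layouts, the "U"/"L" source tags, and all function names are assumptions made for illustration; they are not part of any MapReduce API. Each mapper emits the join key (the user ID) with a tagged value, the shuffle groups values by key, and the reducer pairs every user record with every log record that shares the key.

```python
from collections import defaultdict
from itertools import product


def map_users(record):
    # Hypothetical user record: (user_id, name). Tag the value with its source.
    user_id, name = record
    yield user_id, ("U", name)


def map_logs(record):
    # Hypothetical log record: (user_id, url). Tag the value with its source.
    user_id, url = record
    yield user_id, ("L", url)


def shuffle(pairs):
    # Stand-in for the framework's shuffle: group all values by join key.
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_join(key, values):
    # Separate the values by source tag, then emit every matching combination.
    # A key seen in only one dataset yields nothing, which makes this an inner join.
    names = [v for tag, v in values if tag == "U"]
    urls = [v for tag, v in values if tag == "L"]
    for name, url in product(names, urls):
        yield key, name, url


def reduce_side_join(users, logs):
    pairs = []
    for record in users:
        pairs.extend(map_users(record))
    for record in logs:
        pairs.extend(map_logs(record))
    joined = []
    # Process keys in sorted order, as reducers receive them.
    for key, values in sorted(shuffle(pairs).items()):
        joined.extend(reduce_join(key, values))
    return joined


users = [(1, "alice"), (2, "bob"), (3, "carol")]
logs = [(1, "/home"), (2, "/cart"), (1, "/search"), (4, "/home")]
joined = reduce_side_join(users, logs)
```

In this sketch, user 3 has no log entries and user 4 has no user record, so neither appears in `joined`. Neither dataset has to fit on a single node: only the values for one key are held together at a time, which is why this approach still works when both inputs are large.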