After filtering the bad rows out of the data, my next move is to build a compound key from the extracted data
fields via the Set Key Values step. This step creates a combined key value, comb_key, from Fields 2 and 3, separating
the string values with a dash. (Actually, it is created as a user-defined Java expression.) The step also creates a
comb_value field with a value of 1, as shown in Figure 10-9.
Figure 10-9. Java expression for Set Key Values step of mapper transformation
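To make the key construction concrete, here is a minimal plain-Java sketch of what the Set Key Values step computes for each row. It is not the expression from the figure or any PDI API; the class name, the method names, and the sample values FORD and FOCUS are hypothetical and used only for illustration.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.Map;

// Hypothetical stand-in for the Set Key Values step; this is not PDI code.
final class SetKeyValuesSketch {

    // Joins the two extracted fields with a dash, mirroring comb_key.
    static String combKey(String field2, String field3) {
        return field2 + "-" + field3;
    }

    // Pairs the compound key with the constant count of 1, mirroring comb_value.
    static Map.Entry<String, Integer> mapRow(String field2, String field3) {
        return new SimpleEntry<>(combKey(field2, field3), 1);
    }

    public static void main(String[] args) {
        // Sample field values are assumptions, not data from the original job.
        Map.Entry<String, Integer> out = mapRow("FORD", "FOCUS");
        System.out.println(out.getKey() + "\t" + out.getValue());
    }
}
```

Emitting a constant 1 per row is what lets the reducer later count occurrences of each compound key by summing the values.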
These values are then passed to the MapReduce Output step, which assigns the comb_key and comb_value
variables to the mapper output variables key and value, as shown in Figure 10-10.
Figure 10-10. Output step of mapper transformation
That completes the definition of the mapper transformation that creates a key and value from the incoming data.
As described earlier, the reducer transformation accepts the incoming key/value pairs, sorts them, groups them
by key, sums the values, and finally outputs the results. Figure 10-11 shows the structure of the reducer
transformation. The Input step is the same as for the mapper transformation, so Figure 10-12 instead provides greater
detail on the Sort Rows step. Here, I define a single field: the sort key field (at the bottom). I also reduce the value
in the sort size field because I didn't have much data and I wanted to save memory.
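The reducer logic described above (sort, group by key, sum) can be sketched in plain Java as follows. This is only an illustration under stated assumptions, not the PDI transformation itself: the class and method names and the sample keys are hypothetical, and a TreeMap stands in for the separate sort and group steps.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-in for the reducer transformation; this is not PDI code.
final class ReducerSketch {

    // Sums the values for each key; the TreeMap keeps the keys in sorted order.
    static Map<String, Integer> reduce(List<Map.Entry<String, Integer>> pairs) {
        Map<String, Integer> totals = new TreeMap<>();
        for (Map.Entry<String, Integer> pair : pairs) {
            // merge() inserts the value for a new key or adds it to the running sum.
            totals.merge(pair.getKey(), pair.getValue(), Integer::sum);
        }
        return totals;
    }

    public static void main(String[] args) {
        // Sample mapper output; the keys are assumptions for illustration.
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        pairs.add(new SimpleEntry<>("FORD-FOCUS", 1));
        pairs.add(new SimpleEntry<>("AUDI-A4", 1));
        pairs.add(new SimpleEntry<>("FORD-FOCUS", 1));

        for (Map.Entry<String, Integer> total : reduce(pairs).entrySet()) {
            System.out.println(total.getKey() + "\t" + total.getValue());
        }
    }
}
```

The sketch holds every distinct key in memory at once, which is acceptable for a small data set like the one described here.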