The Mapper function gets called once with each of these lines. It doesn't
matter in which order the calls happen, or even whether the three calls are
performed on the same machine. The output of the three Mapper calls looks
like:
1: [{tomorrow, 3}, {and, 2}]
2: [{creeps, 1}, {in, 1}, {this, 1}, {petty, 1}, {pace, 1},
    {from, 1}, {day, 2}, {to, 1}]
3: [{to, 1}, {the, 1}, {last, 1}, {syllable, 1}, {of, 1},
    {recorded, 1}, {time, 1}]
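The Mapper step above can be sketched in a few lines of Python. This is only an illustration, not the implementation used by any particular framework: the function name mapper is hypothetical, and the three input lines are an assumption reconstructed from the output shown above (lowercased, punctuation removed).

```python
from collections import Counter

def mapper(line):
    """Count the words on one line and return (word, count) pairs."""
    # Counter remembers first-seen order, so the pairs come out in the
    # order in which each word first appears on the line.
    return list(Counter(line.lower().split()).items())

# Assumed input: reconstructed from the Mapper output in the text.
lines = [
    "tomorrow and tomorrow and tomorrow",
    "creeps in this petty pace from day to day",
    "to the last syllable of recorded time",
]

# Each call is independent of the others, so the calls could run in
# any order or on different machines.
mapped = {number: mapper(line) for number, line in enumerate(lines, start=1)}

for number, pairs in mapped.items():
    print(number, pairs)
```

Because each call only looks at its own line, nothing in the sketch depends on the order of the calls, which is the property the text relies on.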
Next, the Shuffle phase goes to work on the Mapper's output and produces
the following:
{and, [2]}
{creeps, [1]}
{day, [2]}
{from, [1]}
{in, [1]}
{last, [1]}
{of, [1]}
{pace, [1]}
{petty, [1]}
{recorded, [1]}
{syllable, [1]}
{the, [1]}
{this, [1]}
{time, [1]}
{to, [1, 1]}
{tomorrow, [3]}
The Shuffle output is mostly uninteresting except for "to", which is the only
word to appear on more than one line. The "to" entry contains a list of two
elements, one for each time it appeared in the Mapper's output.
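The grouping done by the Shuffle phase can be sketched as follows. This is a single-process illustration under stated assumptions: the function name shuffle is hypothetical, the Mapper output is copied in as literal data, and sorting by word is done only to match the listing above. A real framework would do this work across machines.

```python
from collections import defaultdict

def shuffle(mapper_outputs):
    """Group the counts by word and return (word, [counts]) pairs."""
    grouped = defaultdict(list)
    for pairs in mapper_outputs:
        for word, count in pairs:
            # One list element per Mapper output that mentioned the word.
            grouped[word].append(count)
    # Sorted by word to match the listing in the text.
    return sorted(grouped.items())

# The three Mapper outputs from the text, written as Python tuples.
mapper_outputs = [
    [("tomorrow", 3), ("and", 2)],
    [("creeps", 1), ("in", 1), ("this", 1), ("petty", 1),
     ("pace", 1), ("from", 1), ("day", 2), ("to", 1)],
    [("to", 1), ("the", 1), ("last", 1), ("syllable", 1),
     ("of", 1), ("recorded", 1), ("time", 1)],
]

shuffled = shuffle(mapper_outputs)
for word, counts in shuffled:
    print(word, counts)
```

Only "to" ends up with a two-element list, because it is the only word emitted by more than one Mapper call.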
Finally, this data is passed to the Reducer, which sums the counts in each
word's list to produce a total for each word. The results follow: