The last section of the output, titled “Counters,” shows the statistics that Hadoop generates
for each job it runs. These are very useful for checking whether the amount of data
processed is what you expected. For example, we can follow the number of records that
went through the system: five map input records produced five map output records (since
the mapper emitted one output record for each valid input record), then five reduce input
records in two groups (one for each unique key) produced two reduce output records.
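
To make the record counts concrete, here is a minimal Python sketch that imitates the map, group, and reduce steps in memory and tallies the same quantities. It is not Hadoop code: the function name, the counter names, and the five sample (year, temperature) pairs are assumptions chosen only to be consistent with the output discussed in this section.

```python
from collections import defaultdict

# Hypothetical sample: five valid (year, temperature) readings.
# Temperatures are assumed to be in tenths of a degree Celsius.
SAMPLE_RECORDS = [(1950, 0), (1950, 22), (1950, -11), (1949, 111), (1949, 78)]


def run_job(records):
    """Simulate a max-temperature job and count records at each stage."""
    counters = {
        "map_input_records": 0,
        "map_output_records": 0,
        "reduce_input_groups": 0,
        "reduce_input_records": 0,
        "reduce_output_records": 0,
    }

    # Map: emit one (year, temperature) pair per valid input record.
    map_output = []
    for year, temperature in records:
        counters["map_input_records"] += 1
        map_output.append((year, temperature))
        counters["map_output_records"] += 1

    # Shuffle: group the map output values by key (the year).
    groups = defaultdict(list)
    for year, temperature in map_output:
        groups[year].append(temperature)

    # Reduce: one call per group, emitting the maximum for that year.
    results = {}
    for year in sorted(groups):
        temperatures = groups[year]
        counters["reduce_input_groups"] += 1
        counters["reduce_input_records"] += len(temperatures)
        results[year] = max(temperatures)
        counters["reduce_output_records"] += 1

    return results, counters


results, counters = run_job(SAMPLE_RECORDS)
for name, value in counters.items():
    print(f"{name}={value}")
```

With this sample the sketch reports five map input and output records, five reduce input records in two groups, and two reduce output records, matching the flow described above.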
The output was written to the output directory, which contains one output file per reducer.
The job had a single reducer, so we find a single file, named part-r-00000:
% cat output/part-r-00000
1949 111
1950 22
This result is the same as when we went through it by hand earlier. We interpret this as
saying that the maximum temperature recorded in 1949 was 11.1°C, and in 1950 it was
2.2°C.
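
The interpretation above can be sketched as a small Python helper that reads the reducer's output and converts each value to degrees Celsius. It assumes each line holds a year and a temperature in tenths of a degree separated by whitespace, which is consistent with reading 111 as 11.1°C; the function name and the inline sample text are illustrative.

```python
def parse_output(text):
    """Map each year to its maximum temperature in degrees Celsius."""
    readings = {}
    for line in text.splitlines():
        if not line.strip():
            continue  # ignore blank lines
        year, tenths = line.split()
        # Assumption: the stored value is in tenths of a degree Celsius.
        readings[int(year)] = int(tenths) / 10
    return readings


# Sample text standing in for the contents of output/part-r-00000.
sample = "1949\t111\n1950\t22\n"
maxima = parse_output(sample)
for year, celsius in sorted(maxima.items()):
    print(f"{year}: {celsius:.1f}°C")
```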