After confirming that the job has completed, we call the Job's getCounters() method,
which returns a Counters object encapsulating all the counters for the job. The
Counters class provides various methods for finding the names and values of counters.
We use the findCounter() method, which takes an enum, to find the number of records
that had a missing temperature field and also the total number of records processed
(from a built-in counter).
Finally, we print the proportion of records that had a missing temperature field. Here's
what we get for the whole weather dataset:
% hadoop jar hadoop-examples.jar MissingTemperatureFields
job_1410450250506_0007
Records with missing temperature fields: 5.47%
User-Defined Streaming Counters
A Streaming MapReduce program can increment counters by sending a specially formatted
line to the standard error stream, which is co-opted as a control channel in this case.
The line must have the following format:
reporter:counter:group,counter,amount
This snippet in Python shows how to increment the “Missing” counter in the “Temperature”
group by 1:
sys.stderr.write("reporter:counter:Temperature,Missing,1\n")
In a similar way, a status message may be sent with a line formatted like this:
reporter:status:message
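
The two line formats above can be wrapped in small helper functions. The following is a minimal sketch, not part of Hadoop itself: the names increment_counter and set_status and the stream parameter are hypothetical and introduced only for illustration. The comma check is a precaution assumed here because commas separate the three fields of a counter line.

```python
import io
import sys


def increment_counter(group, counter, amount=1, stream=sys.stderr):
    """Write a Streaming counter-update line to the control channel.

    Hypothetical helper: the name and signature are illustrative only.
    """
    # Commas separate the three fields, so reject them in the names
    # (a precaution assumed for this sketch).
    if "," in group or "," in counter:
        raise ValueError("group and counter names must not contain commas")
    stream.write("reporter:counter:%s,%s,%d\n" % (group, counter, amount))


def set_status(message, stream=sys.stderr):
    """Write a Streaming status line to the control channel.

    Hypothetical helper: the name and signature are illustrative only.
    """
    stream.write("reporter:status:%s\n" % message)


if __name__ == "__main__":
    # Capture the control lines in memory so the demo can show them on stdout.
    buf = io.StringIO()
    increment_counter("Temperature", "Missing", stream=buf)
    set_status("processing weather records", stream=buf)
    sys.stdout.write(buf.getvalue())
```

In a real Streaming script the stream argument would be left at its default, so the lines go to standard error; passing an in-memory buffer, as in the demo, just makes the output easy to inspect.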