By default, log aggregation is not enabled. In this case, task logs can be retrieved by visiting the node manager's web UI at http://node-manager-host:8042/logs/userlogs.
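If you want the logs to remain available after the application finishes, you can enable log aggregation, which moves task logs to HDFS so they can be fetched with the yarn logs command. A minimal sketch, assuming the standard YARN property name, set in yarn-site.xml on the cluster:

<property>
  <name>yarn.log-aggregation-enable</name>
  <value>true</value>
</property>

% yarn logs -applicationId <application ID>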
It is straightforward to write to these logfiles. Anything written to standard output or
standard error is directed to the relevant logfile. (Of course, in Streaming, standard output
is used for the map or reduce output, so it will not show up in the standard output log.)
In Java, you can write to the task's syslog file if you wish by using the Apache Commons Logging API (or indeed any logging API that can write to log4j). This is shown in Example 6-13.

Example 6-13. An identity mapper that writes to standard output and also uses the Apache Commons Logging API
import java.io.IOException;

import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;
import org.apache.hadoop.mapreduce.Mapper;

public class LoggingIdentityMapper<KEYIN, VALUEIN, KEYOUT, VALUEOUT>
    extends Mapper<KEYIN, VALUEIN, KEYOUT, VALUEOUT> {

  private static final Log LOG = LogFactory.getLog(LoggingIdentityMapper.class);

  @Override
  @SuppressWarnings("unchecked")
  public void map(KEYIN key, VALUEIN value, Context context)
      throws IOException, InterruptedException {
    // Log to stdout file
    System.out.println("Map key: " + key);

    // Log to syslog file
    LOG.info("Map key: " + key);
    if (LOG.isDebugEnabled()) {
      LOG.debug("Map value: " + value);
    }
    context.write((KEYOUT) key, (VALUEOUT) value);
  }
}
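Notice the LOG.isDebugEnabled() guard around the debug call: it avoids the cost of building the "Map value: " message string for every record when debug logging is switched off, which is worthwhile in a method that runs once per input record.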
The default log level is INFO, so DEBUG-level messages do not appear in the syslog task logfile. However, sometimes you want to see these messages. To enable this, set mapreduce.map.log.level or mapreduce.reduce.log.level, as appropriate. For example, in this case we could set it for the mapper to see the map values in the log, as follows:
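(A sketch, not the original example's exact invocation: the jar name, driver class, and input/output paths below are placeholders for whatever your job uses. Any driver that goes through ToolRunner/GenericOptionsParser accepts the -D option.)

% hadoop jar my-job.jar MyDriver \
    -D mapreduce.map.log.level=DEBUG input output   # jar, class, and paths are placeholders

Equivalently, the property can be set on the job's configuration in the driver before submission:

import org.apache.hadoop.conf.Configuration;

Configuration conf = new Configuration();
// Hypothetical driver snippet: surface DEBUG messages in each map task's syslog
conf.set("mapreduce.map.log.level", "DEBUG");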