job.setOutputFormatClass(ColumnFamilyOutputFormat.class);
7. Let's have a look at the mapper (TweetMapper):
import java.io.IOException;
import java.nio.ByteBuffer;
import java.util.SortedMap;

import org.apache.cassandra.db.Column;
import org.apache.cassandra.utils.ByteBufferUtil;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class TweetMapper extends
        Mapper<ByteBuffer, SortedMap<ByteBuffer, Column>, Text, IntWritable>
{
    static final String COLUMN_NAME = TwitterCassandraJob.COLUMN_NAME;

    private final static IntWritable one = new IntWritable(1);

    @Override
    public void map(ByteBuffer key, SortedMap<ByteBuffer, Column> columns,
            Context context) throws IOException, InterruptedException
    {
        // Fetch the column of interest from the row's columns and
        // emit its value with a count of 1.
        Column column = columns.get(ByteBufferUtil.bytes(COLUMN_NAME));
        String value = ByteBufferUtil.string(column.value());
        context.write(new Text(value), one);
    }
}
It simply reads a specific column (for example, user) from the streamed columns and emits a count of 1 for each column value. We will be using the same reducer (TweetAggregator) to perform the reduce operation.
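The reducer itself is not listed again here. As a rough sketch only: a count-summing reducer that writes its totals back to Cassandra through ColumnFamilyOutputFormat might look like the following, where the output column name count and the use of the Thrift Mutation types are assumptions inferred from ColumnFamilyOutputFormat's expected key/value types (ByteBuffer and List<Mutation>), not the book's actual code:

import java.io.IOException;
import java.nio.ByteBuffer;
import java.util.Collections;
import java.util.List;

import org.apache.cassandra.thrift.Column;
import org.apache.cassandra.thrift.ColumnOrSuperColumn;
import org.apache.cassandra.thrift.Mutation;
import org.apache.cassandra.utils.ByteBufferUtil;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

public class TweetAggregator extends
        Reducer<Text, IntWritable, ByteBuffer, List<Mutation>>
{
    @Override
    public void reduce(Text word, Iterable<IntWritable> values,
            Context context) throws IOException, InterruptedException
    {
        // Sum the 1s emitted by TweetMapper for this column value.
        int sum = 0;
        for (IntWritable value : values) {
            sum += value.get();
        }

        // Wrap the total in a Thrift Mutation; the column name "count"
        // is an assumed choice for this sketch.
        Column column = new Column();
        column.setName(ByteBufferUtil.bytes("count"));
        column.setValue(ByteBufferUtil.bytes(String.valueOf(sum)));
        column.setTimestamp(System.currentTimeMillis());

        ColumnOrSuperColumn cosc = new ColumnOrSuperColumn();
        cosc.setColumn(column);
        Mutation mutation = new Mutation();
        mutation.setColumn_or_supercolumn(cosc);

        // ColumnFamilyOutputFormat expects a ByteBuffer row key and a
        // list of mutations to apply to that row.
        context.write(ByteBufferUtil.bytes(word.toString()),
                Collections.singletonList(mutation));
    }
}

Each input key here is one distinct column value, so the output row key is the value being counted and its single count column carries the aggregated total.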