job.setOutputFormatClass(ColumnFamilyOutputFormat.class);
7. Let's have a look at the mapper (TweetMapper):
import java.io.IOException;
import java.nio.ByteBuffer;
import java.util.SortedMap;

import org.apache.cassandra.db.Column;
import org.apache.cassandra.utils.ByteBufferUtil;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class TweetMapper extends
        Mapper<ByteBuffer, SortedMap<ByteBuffer, Column>, Text, IntWritable>
{
    static final String COLUMN_NAME = TwitterCassandraJob.COLUMN_NAME;

    private final static IntWritable one = new IntWritable(1);

    @Override
    public void map(ByteBuffer key, SortedMap<ByteBuffer, Column> columns,
            Context context) throws IOException, InterruptedException
    {
        // Fetch the column of interest from the row's columns and
        // emit its value with a count of 1.
        Column column = columns.get(ByteBufferUtil.bytes(COLUMN_NAME));
        String value = ByteBufferUtil.string(column.value());
        context.write(new Text(value), one);
    }
}
It simply reads a specific column (for example, user) from the streamed columns and emits a count of 1 for each column value. We will be using the same reducer (TweetAggregator) to perform the reduce operation.
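The reducer itself is not listed again here. As a rough sketch only: a count-summing reducer that writes its totals back to Cassandra through ColumnFamilyOutputFormat might look like the following, where the output column name count and the use of the Thrift Mutation types are assumptions inferred from ColumnFamilyOutputFormat's expected key/value types (ByteBuffer and List<Mutation>), not the book's actual code:

import java.io.IOException;
import java.nio.ByteBuffer;
import java.util.Collections;
import java.util.List;

import org.apache.cassandra.thrift.Column;
import org.apache.cassandra.thrift.ColumnOrSuperColumn;
import org.apache.cassandra.thrift.Mutation;
import org.apache.cassandra.utils.ByteBufferUtil;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

public class TweetAggregator extends
        Reducer<Text, IntWritable, ByteBuffer, List<Mutation>>
{
    @Override
    public void reduce(Text word, Iterable<IntWritable> values,
            Context context) throws IOException, InterruptedException
    {
        // Sum the 1s emitted by TweetMapper for this column value.
        int sum = 0;
        for (IntWritable value : values) {
            sum += value.get();
        }

        // Wrap the total in a Thrift Mutation; the column name "count"
        // is an assumed choice for this sketch.
        Column column = new Column();
        column.setName(ByteBufferUtil.bytes("count"));
        column.setValue(ByteBufferUtil.bytes(String.valueOf(sum)));
        column.setTimestamp(System.currentTimeMillis());

        ColumnOrSuperColumn cosc = new ColumnOrSuperColumn();
        cosc.setColumn(column);
        Mutation mutation = new Mutation();
        mutation.setColumn_or_supercolumn(cosc);

        // ColumnFamilyOutputFormat expects a ByteBuffer row key and a
        // list of mutations to apply to that row.
        context.write(ByteBufferUtil.bytes(word.toString()),
                Collections.singletonList(mutation));
    }
}

Each input key here is one distinct column value, so the output row key is the value being counted and its single count column carries the aggregated total.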