  public static void main(String[] args) throws Exception {
    int exitCode = ToolRunner.run(HBaseConfiguration.create(),
        new SimpleRowCounter(), args);
    System.exit(exitCode);
  }
}
The RowCounterMapper nested class is a subclass of the HBase TableMapper abstract
class, a specialization of org.apache.hadoop.mapreduce.Mapper that sets the map
input types passed by TableInputFormat. Input keys are ImmutableBytesWritable
objects (row keys), and values are Result objects (row results from a scan). Since
this job counts rows and does not emit any output from the map, we just increment
Counters.ROWS by 1 for every row we see.
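To make the counting logic concrete without pulling in HBase or Hadoop, here is a minimal plain-Java sketch. The RowCountSketch class, its map() and countRows() methods, and the use of byte[] and List<String> as stand-ins for ImmutableBytesWritable and Result are illustrative assumptions, not HBase API; only the Counters.ROWS name comes from the text.

```java
import java.nio.charset.StandardCharsets;
import java.util.EnumMap;
import java.util.List;
import java.util.Map;

// Plain-Java model of a map-only row counter (hypothetical names, no HBase dependency).
class RowCountSketch {
    // Mirrors the Counters.ROWS counter mentioned in the text.
    enum Counters { ROWS }

    // Stand-in for the job's counter registry.
    private final Map<Counters, Long> counters = new EnumMap<>(Counters.class);

    // Called once per row; the arguments are ignored and nothing is emitted.
    void map(byte[] rowKey, List<String> cells) {
        counters.merge(Counters.ROWS, 1L, Long::sum);
    }

    // Reads a counter, treating a never-incremented counter as zero.
    long get(Counters counter) {
        return counters.getOrDefault(counter, 0L);
    }

    // Feeds every row of a simulated scan to map() and returns the ROWS count.
    static long countRows(Map<String, List<String>> scannedRows) {
        RowCountSketch mapper = new RowCountSketch();
        for (Map.Entry<String, List<String>> row : scannedRows.entrySet()) {
            mapper.map(row.getKey().getBytes(StandardCharsets.UTF_8), row.getValue());
        }
        return mapper.get(Counters.ROWS);
    }
}
```

In the real job the framework, not user code, drives the per-row calls and aggregates the counter across all map tasks; the loop in countRows() only imitates that for a single in-memory table.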
In the run() method, we create a Scan object that is used to configure the job by
invoking the TableMapReduceUtil.initTableMapJob() utility method, which, among
other things (such as setting the map class to use), sets the input format to
TableInputFormat.
Notice how we set a filter, an instance of FirstKeyOnlyFilter, on the scan. This
filter runs on the server side and tells the server to short-circuit each row after
its first cell, so the Result object passed to the mapper holds only that first cell.
Since the mapper ignores the cell values, this is a useful optimization: the row
count is unchanged, but far less data is read and transferred.
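The effect of such a filter can be illustrated with a small plain-Java model. This is only a sketch: FirstCellOnlySketch, firstCellOnly(), and cellCount() are hypothetical names, and an in-memory map of row key to cell list stands in for a table. It does not use or reproduce the FirstKeyOnlyFilter implementation.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Plain-Java model of a "first cell only" filter (hypothetical names, no HBase dependency).
class FirstCellOnlySketch {
    // Returns a copy of the table in which each row keeps at most its first cell.
    static Map<String, List<String>> firstCellOnly(Map<String, List<String>> table) {
        Map<String, List<String>> trimmed = new LinkedHashMap<>();
        for (Map.Entry<String, List<String>> row : table.entrySet()) {
            List<String> cells = row.getValue();
            int keep = Math.min(1, cells.size());
            trimmed.put(row.getKey(), new ArrayList<>(cells.subList(0, keep)));
        }
        return trimmed;
    }

    // Total number of cells that would be shipped to the client.
    static int cellCount(Map<String, List<String>> table) {
        int total = 0;
        for (List<String> cells : table.values()) {
            total += cells.size();
        }
        return total;
    }
}
```

Comparing table.size() before and after trimming shows the same number of rows, while cellCount() drops to one cell per row, which is why the filter is safe for a job that only counts rows.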
TIP
You can also find the number of rows in a table by typing count 'tablename' in the HBase shell.
It's not distributed, though, so for large tables the MapReduce program is preferable.
REST and Thrift
HBase ships with REST and Thrift interfaces. These are useful when the interacting
application is written in a language other than Java. In both cases, a Java server
hosts an instance of the HBase client, which brokers REST or Thrift application
requests into and out of the HBase cluster. Consult the Reference Guide for
information on running the services and on the client interfaces.
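As a rough illustration of what a REST client request might look like, the sketch below builds (but does not send) an HTTP GET for a single row. It is written in Java only for consistency with the rest of the chapter; any language with an HTTP library could do the same. The RestRowRequestSketch class and rowRequest() method are hypothetical, and the base URL, the table name, and the /table/row path layout are assumptions that should be checked against the Reference Guide for your HBase version.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;

// Builds a GET request for one row via an HBase REST gateway (hypothetical helper).
class RestRowRequestSketch {
    // Percent-encodes one path segment; URLEncoder emits '+' for spaces, so fix that up.
    private static String encodeSegment(String segment) {
        return URLEncoder.encode(segment, StandardCharsets.UTF_8).replace("+", "%20");
    }

    // Assumed path layout: <baseUrl>/<table>/<rowKey>, asking for a JSON response.
    static HttpRequest rowRequest(String baseUrl, String table, String rowKey) {
        String path = "/" + encodeSegment(table) + "/" + encodeSegment(rowKey);
        return HttpRequest.newBuilder(URI.create(baseUrl + path))
                .header("Accept", "application/json")
                .GET()
                .build();
    }
}
```

Sending the request would be a matter of passing it to java.net.http.HttpClient against a running REST server; that step is left out here because it depends on the deployment.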