SCANNERS
HBase scanners are like cursors in a traditional database or Java iterators, except that, unlike the
latter, they have to be closed after use. Scanners return rows in order. Users obtain a scanner on a
Table object by calling getScanner(), passing a configured instance of a Scan object as a parameter. In
the Scan instance, you can pass the row at which to start and stop the scan, which columns in a row to
return in the row result, and a filter to run on the server side. The ResultScanner interface, which is
returned when you call getScanner(), is as follows:
public interface ResultScanner extends Closeable, Iterable<Result> {
  public Result next() throws IOException;
  public Result[] next(int nbRows) throws IOException;
  public void close();
}
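Because a scanner is both Closeable and Iterable, the natural way to consume one is a for-each loop inside a try-with-resources block. The following is a minimal sketch of that pattern, not HBase code: RowScanner, scanKeys(), and the TreeMap standing in for a table are all hypothetical, and the sketch assumes the start row is inclusive and the stop row exclusive. With HBase itself, the scanner would come from getScanner() on a Table, as described above.

```java
import java.io.Closeable;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

// Stand-in types only: this is NOT the HBase API, just the same shape in miniature.
class InMemoryScannerSketch {

    // Hypothetical scanner: iterable over rows and closeable, like ResultScanner.
    static final class RowScanner implements Closeable, Iterable<Map.Entry<String, String>> {
        private final Iterator<Map.Entry<String, String>> rows;

        RowScanner(NavigableMap<String, String> table, String startRow, String stopRow) {
            // Assumption of this sketch: start row inclusive, stop row exclusive.
            this.rows = table.subMap(startRow, true, stopRow, false).entrySet().iterator();
        }

        @Override
        public Iterator<Map.Entry<String, String>> iterator() {
            return rows;
        }

        @Override
        public void close() {
            // Nothing to release in memory; a real scanner would free server-side resources.
        }
    }

    // Scans [startRow, stopRow) and returns the row keys in the order they were seen.
    static List<String> scanKeys(NavigableMap<String, String> table,
                                 String startRow, String stopRow) {
        List<String> keys = new ArrayList<>();
        // try-with-resources closes the scanner even if the loop body throws.
        try (RowScanner scanner = new RowScanner(table, startRow, stopRow)) {
            for (Map.Entry<String, String> row : scanner) {
                keys.add(row.getKey());
            }
        }
        return keys;
    }

    public static void main(String[] args) {
        NavigableMap<String, String> table = new TreeMap<>();
        table.put("row3", "value3");
        table.put("row1", "value1");
        table.put("row2", "value2");
        System.out.println(scanKeys(table, "row1", "row3"));
    }
}
```

Rows are inserted out of order here, but the scan still returns row1 and then row2, because the underlying map keeps its keys sorted. The stop row, row3, is excluded.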
You can ask for the next row's results, or a number of rows. Scanners will, under the covers, fetch
batches of 100 rows at a time, bringing them client-side and returning to the server to fetch the next
batch only after the current batch has been exhausted. The number of rows to fetch and cache in this way
is determined by the hbase.client.scanner.caching configuration option. Alternatively, you
can set how many rows to cache on the Scan instance itself via the setCaching() method.
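To make the effect of the caching value concrete, here is a small simulation of the fetch-a-batch-then-drain behavior just described. It is a sketch under stated assumptions rather than HBase's implementation: FakeServer, CachingScanner, and scanAll() are hypothetical names, and the "server" is simply a list in memory that counts how many times it is asked for rows.

```java
import java.util.ArrayList;
import java.util.List;

// Simulation only: none of these classes exist in HBase.
class ScannerCachingSketch {

    // Simulated server: hands out rows in batches and counts the requests it serves.
    static final class FakeServer {
        private final List<String> rows = new ArrayList<>();
        int fetchCalls = 0;

        FakeServer(int totalRows) {
            for (int i = 0; i < totalRows; i++) {
                rows.add("row" + i);
            }
        }

        List<String> fetch(int offset, int caching) {
            fetchCalls++;
            int from = Math.min(offset, rows.size());
            int to = Math.min(from + caching, rows.size());
            return new ArrayList<>(rows.subList(from, to));
        }
    }

    // Client-side scanner: serves rows from a local cache, refilling it only when empty.
    static final class CachingScanner {
        private final FakeServer server;
        private final int caching;
        private List<String> cache = new ArrayList<>();
        private int cachePos = 0;
        private int serverOffset = 0;
        private boolean exhausted = false;

        CachingScanner(FakeServer server, int caching) {
            if (caching <= 0) {
                throw new IllegalArgumentException("caching must be positive");
            }
            this.server = server;
            this.caching = caching;
        }

        // Returns the next row, or null when there are no more rows.
        String next() {
            if (cachePos == cache.size()) {
                if (exhausted) {
                    return null;
                }
                cache = server.fetch(serverOffset, caching); // one round-trip
                serverOffset += cache.size();
                cachePos = 0;
                if (cache.size() < caching) {
                    exhausted = true; // a short batch means the server has no more rows
                }
                if (cache.isEmpty()) {
                    return null;
                }
            }
            return cache.get(cachePos++);
        }
    }

    // Scans everything and returns { rows seen, server round-trips }.
    static int[] scanAll(int totalRows, int caching) {
        FakeServer server = new FakeServer(totalRows);
        CachingScanner scanner = new CachingScanner(server, caching);
        int rowsSeen = 0;
        while (scanner.next() != null) {
            rowsSeen++;
        }
        return new int[] { rowsSeen, server.fetchCalls };
    }

    public static void main(String[] args) {
        int[] big = scanAll(250, 100);
        int[] small = scanAll(250, 10);
        System.out.println("caching=100: " + big[1] + " round-trips");
        System.out.println("caching=10:  " + small[1] + " round-trips");
    }
}
```

In this model, scanning 250 rows takes 3 round-trips with a caching value of 100 and 26 with a caching value of 10 (the last fetch in the second case comes back empty and signals the end of the scan). The price of the larger value is that up to 100 rows sit in client memory at once.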
Higher caching values will enable faster scanning but will eat up more memory in the client. Also, avoid
setting the caching so high that the time spent processing the batch client-side exceeds the scanner
timeout period. If a client fails to check back with the server before the scanner timeout expires, the
server will go ahead and garbage collect resources consumed by the scanner server-side. The default
scanner timeout is 60 seconds, and can be changed by setting hbase.client.scanner.timeout.period.
Clients will see an UnknownScannerException if the scanner timeout has expired.
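The interplay between caching and the timeout comes down to simple arithmetic: the client only contacts the server again once it has worked through the whole cached batch. The sketch below illustrates this under a simplifying assumption that every row takes the same time to process. The class and method names are hypothetical, and the 10 ms per-row cost is an invented figure for illustration.

```java
// Back-of-the-envelope model only; perRowMillis is an assumed, uniform processing cost.
class ScannerTimeoutSketch {

    // Time spent on one cached batch before the client contacts the server again.
    static long batchProcessingMillis(int caching, long perRowMillis) {
        return caching * perRowMillis;
    }

    // True if processing a full batch would outlast the scanner timeout.
    static boolean exceedsTimeout(int caching, long perRowMillis, long timeoutMillis) {
        return batchProcessingMillis(caching, perRowMillis) > timeoutMillis;
    }

    // Largest caching value whose batch still finishes within the timeout.
    static long largestSafeCaching(long perRowMillis, long timeoutMillis) {
        if (perRowMillis <= 0) {
            throw new IllegalArgumentException("perRowMillis must be positive");
        }
        return timeoutMillis / perRowMillis;
    }

    public static void main(String[] args) {
        long timeoutMillis = 60_000; // the default 60-second scanner timeout
        long perRowMillis = 10;      // hypothetical processing cost per row
        System.out.println(exceedsTimeout(100, perRowMillis, timeoutMillis));
        System.out.println(exceedsTimeout(10_000, perRowMillis, timeoutMillis));
        System.out.println(largestSafeCaching(perRowMillis, timeoutMillis));
    }
}
```

With these figures, a batch of 100 rows takes 1 second to process, well inside the timeout, whereas a batch of 10,000 rows takes 100 seconds and would leave the scanner expired on the server before the client asks for more. In practice you would leave generous headroom rather than aim for the limit.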
The simplest way to compile the program is to use the Maven POM that comes with the
book's example code. Then we can use the hbase command followed by the classname
to run the program. Here's a sample run:
% mvn package
% export HBASE_CLASSPATH=hbase-examples.jar
% hbase ExampleClient
Get: keyvalues={row1/data:1/1414932826551/Put/vlen=6/mvcc=0}
Scan: keyvalues={row1/data:1/1414932826551/Put/vlen=6/mvcc=0}
Scan: keyvalues={row2/data:2/1414932826564/Put/vlen=6/mvcc=0}
Scan: keyvalues={row3/data:3/1414932826566/Put/vlen=6/mvcc=0}
Each line of output shows an HBase row, rendered using the toString() method from
Result. The fields are separated by a slash character, and are as follows: the row name,
the column name, the cell timestamp, the cell type, the length of the value's byte array
(vlen), and an internal HBase field (mvcc). We'll see later how to get the value from a
Result object using its getValue() method.
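To make the field layout concrete, the following sketch splits one of the output lines above into its six named fields. It is plain string handling, not part of HBase: the class and method names are hypothetical, and it assumes a single cell per line and no slash characters inside the row or column name.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative parser for the sample output format only; not an HBase utility.
class KeyValueLineSketch {

    // Splits "…keyvalues={row/column/timestamp/type/vlen=N/mvcc=N}" into named fields.
    static Map<String, String> parse(String line) {
        int open = line.indexOf('{');
        int close = line.lastIndexOf('}');
        if (open < 0 || close < open) {
            throw new IllegalArgumentException("no {...} section in: " + line);
        }
        String[] parts = line.substring(open + 1, close).split("/");
        if (parts.length != 6) {
            throw new IllegalArgumentException("expected 6 fields, got " + parts.length);
        }
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("row", parts[0]);
        fields.put("column", parts[1]);     // family:qualifier
        fields.put("timestamp", parts[2]);
        fields.put("type", parts[3]);
        // Drop the "vlen=" and "mvcc=" prefixes, keeping only the numbers.
        fields.put("vlen", parts[4].substring(parts[4].indexOf('=') + 1));
        fields.put("mvcc", parts[5].substring(parts[5].indexOf('=') + 1));
        return fields;
    }

    public static void main(String[] args) {
        String line = "Scan: keyvalues={row2/data:2/1414932826564/Put/vlen=6/mvcc=0}";
        for (Map.Entry<String, String> field : parse(line).entrySet()) {
            System.out.println(field.getKey() + " = " + field.getValue());
        }
    }
}
```

Parsing toString() output like this is useful only for reading logs. In a program, you would read the fields from the Result object directly.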