Batch writes
Currently, in our code, each time we call htable.put(one_put), we make an RPC
call to an HBase region server. We can reduce this round-trip overhead by calling
htable.put() with a list of put records. Then, with one round trip, we can insert
many records into HBase. This is called a batch put.
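To make the round-trip argument concrete, here is a minimal sketch that only counts calls. It does not use HBase at all: CountingTable is a hypothetical in-memory stand-in, and each call to one of its put methods is treated as one simulated round trip. A real client may split a batch across several region servers, so the counts below illustrate the idea rather than measure anything.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a table client; it stores nothing and only counts calls.
final class RoundTripSketch {

    static final class CountingTable {
        int roundTrips = 0;
        int rowsStored = 0;

        // One call per record: one simulated round trip each time.
        void put(String record) {
            roundTrips++;
            rowsStored++;
        }

        // One call for the whole list: one simulated round trip in total.
        void put(List<String> records) {
            roundTrips++;
            rowsStored += records.size();
        }
    }

    public static void main(String[] args) {
        int total = 100;

        // Individual puts: one call per record.
        CountingTable single = new CountingTable();
        for (int i = 0; i < total; i++) {
            single.put("user-" + i);
        }

        // Batch put: collect the records first, then make a single call.
        CountingTable batched = new CountingTable();
        List<String> records = new ArrayList<>();
        for (int i = 0; i < total; i++) {
            records.add("user-" + i);
        }
        batched.put(records);

        System.out.println("individual puts: " + single.roundTrips + " round trips");
        System.out.println("batch put: " + batched.roundTrips + " round trip");
    }
}
```

Both tables end up with the same 100 rows; only the number of calls differs.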
Here is an example of a batch put. Only the relevant section is shown for clarity. For
the full code, see hbase_dp.ch8.UserInsert3.java:
    int total = 100;
    long t4a = System.currentTimeMillis();
    List<Put> puts = new ArrayList<>();
    for (int i = 0; i < total; i++) {
        int userid = i;
        String email = "user-" + i + "@foo.com";
        String phone = "555-1234";
        byte[] key = Bytes.toBytes(userid);
        Put put = new Put(key);
        put.add(Bytes.toBytes(familyName), Bytes.toBytes("email"),
                Bytes.toBytes(email));
        put.add(Bytes.toBytes(familyName), Bytes.toBytes("phone"),
                Bytes.toBytes(phone));
        puts.add(put); // just add to the list
    }
    htable.put(puts); // do a batch put
    long t4b = System.currentTimeMillis();
    System.out.println("inserted " + total + " users in " + (t4b - t4a) + " ms");
A sample run with a batch put is as follows:
inserted 100 users in 48 ms
The same code with individual puts took around 350 milliseconds, roughly seven times longer!
Use batch writes when you can to minimize latency.
Note that the HTableUtil class that comes with HBase implements some smart
batching options for your use and enjoyment.
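When there are too many records to hold comfortably in one list, a common variation is to send them in fixed-size batches. The following is a minimal sketch of that idea in plain Java, not the HTableUtil implementation. The names ChunkedBatchSketch and sendInBatches are hypothetical, and the Consumer argument stands in for a call such as htable.put(puts).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

final class ChunkedBatchSketch {

    // Hands items to sink in lists of at most batchSize elements.
    // Returns the number of batches sent.
    static <T> int sendInBatches(Iterable<T> items, int batchSize, Consumer<List<T>> sink) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        int batches = 0;
        List<T> buffer = new ArrayList<>(batchSize);
        for (T item : items) {
            buffer.add(item);
            if (buffer.size() == batchSize) {
                sink.accept(new ArrayList<>(buffer)); // copy, so the sink may keep the list
                buffer.clear();
                batches++;
            }
        }
        if (!buffer.isEmpty()) {
            sink.accept(new ArrayList<>(buffer)); // final, possibly smaller batch
            batches++;
        }
        return batches;
    }

    public static void main(String[] args) {
        List<String> users = new ArrayList<>();
        for (int i = 0; i < 250; i++) {
            users.add("user-" + i + "@foo.com");
        }
        // In real code the sink would be something like: batch -> htable.put(batch)
        int sent = sendInBatches(users, 100,
                batch -> System.out.println("sending " + batch.size() + " records"));
        System.out.println(sent + " batches");
    }
}
```

With 250 records and a batch size of 100, the sink is called three times, with 100, 100, and 50 records. The batch size is a trade-off between fewer round trips and more memory held per call, so it is best chosen by measuring.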
 