Merging regions
Load balancing and region splitting are common approaches to performance
tuning in HBase. However, when a large chunk of data has been deleted, or when
the number of regions on a region server needs to be reduced, HBase provides a
tool that merges two adjacent regions. This operation should only be performed
when the HBase cluster is offline. Run the following command on any region server:
$ ./bin/hbase org.apache.hadoop.hbase.util.Merge <table-name> <region-1> <region-2>
MemStore-local allocation buffers
MemStore-local allocation buffers (MSLABs) are fixed-size buffers that contain
KeyValue instances of varying sizes. Without them, every flush of KeyValue
instances to disk leaves holes of varying sizes in the old-generation heap, which
can lead to fragmentation-related issues. With MSLABs, objects of the same size
are allocated from the heap. Once these objects tenure and are collected, they
leave holes of one specific size in the heap, and later allocations of new objects
of exactly that size can reuse these holes.
Whenever a buffer cannot fit a newly added KeyValue, it is considered full and a
new fixed-size buffer is created. This feature is enabled through the configuration
property hbase.hregion.memstore.mslab.enabled, and the size of each fixed-size
buffer (2 MB by default) is set by the hbase.hregion.memstore.mslab.chunksize
property. A related property, hbase.hregion.memstore.mslab.max.allocation, defines
the upper size limit (256 KB by default) of KeyValue instances that can be stored
in a buffer. Larger instances are stored directly in the Java heap, where they can
again cause fragmentation.
KeyValue instances rarely fill a buffer completely, so some of each buffer's
capacity is wasted, and a poorly chosen configuration can slow performance as well.
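The buffer behavior described in this section can be illustrated with a small toy model. The sketch below is not HBase's implementation. The class name MslabSketch, the Slot type, and the methods allocate, chunkCount, and wastedBytes are hypothetical names chosen for illustration, and the two constants only mirror the default values quoted above. It shows three things: a request larger than the upper limit is refused (the caller would place it directly on the heap), a new fixed-size chunk is started when the current one cannot fit the request, and the unused tail of a retired chunk is counted as waste.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of MSLAB-style allocation; an illustration, not HBase code. */
final class MslabSketch {
    // Assumed values mirroring the defaults quoted in the text.
    static final int CHUNK_SIZE = 2 * 1024 * 1024;  // 2 MB per fixed-size chunk
    static final int MAX_ALLOCATION = 256 * 1024;   // 256 KB upper limit per entry

    /** Location of an allocation: which chunk, and the offset inside it. */
    static final class Slot {
        final int chunkIndex;
        final int offset;

        Slot(int chunkIndex, int offset) {
            this.chunkIndex = chunkIndex;
            this.offset = offset;
        }
    }

    private final List<byte[]> chunks = new ArrayList<>();
    // Starts "full" so that the first allocation creates the first chunk.
    private int nextFreeOffset = CHUNK_SIZE;
    private long wastedBytes = 0;

    /**
     * Reserves size bytes in the current chunk.
     * Returns null when the request exceeds MAX_ALLOCATION, meaning the
     * caller would have to store the data directly on the heap instead.
     */
    Slot allocate(int size) {
        if (size < 0) {
            throw new IllegalArgumentException("size must be non-negative");
        }
        if (size > MAX_ALLOCATION) {
            return null;
        }
        if (nextFreeOffset + size > CHUNK_SIZE) {
            // Current chunk cannot fit the request: its unused tail is wasted
            // and a fresh fixed-size chunk is started.
            wastedBytes += CHUNK_SIZE - nextFreeOffset;
            chunks.add(new byte[CHUNK_SIZE]);
            nextFreeOffset = 0;
        }
        Slot slot = new Slot(chunks.size() - 1, nextFreeOffset);
        nextFreeOffset += size;
        return slot;
    }

    int chunkCount() {
        return chunks.size();
    }

    long wastedBytes() {
        return wastedBytes;
    }
}
```

Calling allocate repeatedly with sizes close to MAX_ALLOCATION makes the waste visible quickly, because each chunk is retired with a large unused tail. Many small requests pack a chunk almost completely.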
JVM tuning
Tuning the JVM's garbage collection parameters is a must for the region server
processes, as these handle all the data volume and I/O. Region servers do not
work well with the default JVM settings, especially under heavy write loads. In
such cases, the MemStores create and discard objects at various stages, and the
data collects in in-memory buffers that are flushed to disk when the buffers are
full.
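Before changing any garbage collection settings, it helps to know which collectors a JVM is running and how much time they have spent. The sketch below uses the standard java.lang.management API to list this for the JVM it runs in. The class name GcSnapshot and the method summarize are hypothetical names for illustration. It does not inspect a region server by itself; for that, the same figures would have to be read from the region server process, for example through its JMX interface or its GC logs.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

/** Lists the garbage collectors of the current JVM with their activity so far. */
final class GcSnapshot {

    /** Returns one line per collector: name, number of collections, total time. */
    static List<String> summarize() {
        List<String> lines = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            // Count and time may be reported as -1 if the collector does not expose them.
            lines.add(gc.getName()
                    + ": collections=" + gc.getCollectionCount()
                    + ", timeMs=" + gc.getCollectionTime());
        }
        return lines;
    }

    public static void main(String[] args) {
        summarize().forEach(System.out::println);
    }
}
```

The collector names printed depend on the JVM version and on the garbage collection flags it was started with, so comparing the output before and after a settings change is a quick way to confirm that the change took effect.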
 