Database Reference
• Heavy write: Data written goes into the MemStore and is flushed to
form new HFiles, which are later compacted. As a best practice, flushing,
compacting, and splitting should not happen too often, because these processes
increase I/O and slow down the cluster. Some
recommendations are as follows:
° Keep the region size larger to avoid splits at write time
° Keep the HFile size larger to avoid compaction
• Heavy sequential reads: Some recommendations are as follows:
° Use a larger block size to read more data per seek
° Avoid caching on the table
• Heavy random reads: Effective use of the cache and better indexing
yield higher performance. A few recommendations are as follows:
° Use a larger block cache and lower the MemStore limit
° For better indexing, use a smaller block size
° Use bloom filters at the column family level
For mixed workloads with both heavy reads and heavy writes, all the performance
tuning parameters deserve a careful look, and reaching an optimized
configuration usually requires multiple rounds of tuning.
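The column-family recommendations for sequential and random reads above can be sketched as HBase shell DDL. The following is a minimal sketch, not a tuned configuration: the helper name `family_ddl`, the table and family names, and the block sizes (128 KB and 16 KB, chosen around the 64 KB default) are assumptions made for illustration, and disabling the block cache is only one way to read "avoid caching on the table".

```shell
# Hypothetical helper: print an HBase shell "create" statement for one of
# two read profiles. Block sizes are illustrative assumptions, not values
# taken from the text.
family_ddl() {
  # $1 = table name, $2 = column family name, $3 = profile
  case "$3" in
    # Larger blocks read more data per seek; the block cache is disabled.
    sequential) attrs="BLOCKSIZE => '131072', BLOCKCACHE => 'false'" ;;
    # Smaller blocks give a finer index; row-level bloom filter enabled.
    random)     attrs="BLOCKSIZE => '16384', BLOOMFILTER => 'ROW'" ;;
    *) echo "unknown profile: $3" >&2; return 1 ;;
  esac
  printf "create '%s', {NAME => '%s', %s}\n" "$1" "$2" "$attrs"
}

# Usage (requires an HBase client on the PATH):
#   family_ddl events d random | hbase shell
```

Printing the statement instead of executing it makes it easy to review the attributes before applying them to a cluster.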
Troubleshooting
An HBase cluster sometimes does not run smoothly or as expected, especially when
it is badly configured. This section briefly covers the troubleshooting tools and
techniques for an HBase cluster in an ambiguous state. The following are some of
the important tools that administrators should know:
jps: This tool shows the Java processes running for the current user.
$ $JAVA_HOME/bin/jps
jmap: This tool is used to view the Java heap summary. For example,
the following command shows the heap summary for the HRegionServer
daemon, whose process ID is 1812 here:
$ $JAVA_HOME/bin/jmap -heap 1812
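The two tools are often combined: jps supplies the process ID that jmap needs. The following is a minimal sketch of that step, where the helper name `pid_of` is an assumption made for illustration. It expects the default jps output of one "<pid> <main class>" pair per line. Note that the `-heap` option exists in JDK 8 and earlier; later JDKs moved the equivalent summary to `jhsdb jmap --heap`.

```shell
# Hypothetical helper: read jps output ("<pid> <main class>" per line) from
# standard input and print the PID of the first JVM whose class name is $1.
pid_of() {
  awk -v name="$1" '$2 == name { print $1; exit }'
}

# Usage on a region server node (needs a JDK that provides jps and jmap):
#   pid=$("$JAVA_HOME/bin/jps" | pid_of HRegionServer)
#   [ -n "$pid" ] && "$JAVA_HOME/bin/jmap" -heap "$pid"
```

Checking that the PID is non-empty before calling jmap avoids a confusing error when the daemon is not running.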
 