Performance tuning
HBase design and performance depend heavily on how HBase is used by the application. The metrics and tools discussed in the previous section provide system-level monitoring, but this alone does not guarantee that the application accessing HBase achieves optimal performance. Monitoring does, however, supply the statistics that help optimize performance: for example, the Put , Get , or Scan performance achieved by clients against each region server, network latencies, the number of concurrent clients, and many more. All this information helps not only to fine-tune HBase but also to understand what the client expects from HBase.
Performance is always measured in terms of the response times of the operations performed. However, response time must also be judged in the context of the client's needs; for example, an online shopping application backed by an HBase cluster should get a response in milliseconds, whereas for a nightly reporting application, a few extra seconds of response time from HBase (used as the backend) is still acceptable.
HBase is a distributed database built on top of Hadoop, and its performance is affected by everything from the server hardware to the network devices connecting those servers, the operating system, the JVM, and, very importantly, HDFS. Hence, tuning HBase cluster performance typically requires tuning multiple different configuration parameters to fit the application's requirements.
Let's look at the areas that require tuning to achieve optimal HBase performance.
Compression
HBase provides support for various compression algorithms, applied at the column family level. Compression usually yields better performance because reading large amounts of uncompressed data puts more overhead on the I/O path than compressing and decompressing the same dataset puts on the CPU. So, except for data that is already compressed, such as images, we should generally use compression.
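As a sketch, compression for a column family can be enabled from the HBase shell when creating or altering a table (the table name mytable and family name cf here are hypothetical examples):

```
# Create a table whose column family 'cf' stores its data compressed with Snappy
create 'mytable', {NAME => 'cf', COMPRESSION => 'SNAPPY'}

# Or enable compression on an existing column family
alter 'mytable', {NAME => 'cf', COMPRESSION => 'SNAPPY'}

# Inspect the table schema to confirm the compression setting
describe 'mytable'
```

The codec is a per-family choice, so a table can mix compressed and uncompressed families depending on each family's access pattern.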
Available codecs
The following are the different compression algorithms supported by HBase:
Lempel-Ziv-Oberhumer ( LZO ): This lossless data compression algorithm is written in ANSI C and requires a Java Native Interface (JNI) library for its integration with HBase. The algorithm is highly focused on decompression speed. Refer to http://wiki.apache.org/hadoop/UsingLzoCompression for further details.
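Because LZO depends on a native library, it is worth confirming that the codec is correctly installed on each region server before enabling it on a column family. HBase ships a CompressionTest utility for this; a minimal check might look like the following (the file path is only an example):

```
# Attempt to compress and decompress a local test file with the LZO codec;
# the run fails with a native-library error if LZO is not installed correctly
hbase org.apache.hadoop.hbase.util.CompressionTest file:///tmp/testfile.txt lzo
```

The same utility accepts other codec names (for example, gz or snappy), so it can be used to validate whichever codec you plan to configure.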