Performance tuning
HBase design and performance depend heavily on how HBase is used by the application. The metrics and tools discussed in the previous section provide system-level monitoring, but this alone does not guarantee that the application accessing HBase achieves optimal performance. Monitoring does, however, supply the statistics that help optimize performance: for example, the Put , Get , or Scan performance achieved by clients against each region server, network latencies, the number of concurrent clients, and many more. All this information helps not only to fine-tune HBase but also to understand what the client expects from HBase.
Performance is always measured in terms of the response times of the operations performed. However, response time must also be judged in the context of the client's needs; for example, an online shopping application backed by an HBase cluster should get a response in milliseconds, whereas for a nightly reporting application, a few extra seconds of response time from HBase (used as the backend) is still acceptable.
HBase is a distributed database built on top of Hadoop, and its performance is affected by everything from the server hardware to the network devices connecting those servers, the operating system, the JVM, and, very importantly, HDFS. Hence, tuning HBase cluster performance typically requires tuning multiple different configuration parameters to fit the application's requirements.
Let's look at the areas that require tuning to achieve optimal HBase performance.
Compression
HBase provides support for various compression algorithms, applied at the column family level. Compression usually yields better performance because reading large amounts of uncompressed data puts more overhead on the I/O path than compressing and decompressing the same dataset puts on the CPU. So, except for data that is already compressed, such as images, we should generally use compression.
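As a sketch, compression for a column family can be enabled from the HBase shell when creating or altering a table (the table name mytable and family name cf here are hypothetical examples):

```
# Create a table whose column family 'cf' stores its data compressed with Snappy
create 'mytable', {NAME => 'cf', COMPRESSION => 'SNAPPY'}

# Or enable compression on an existing column family
alter 'mytable', {NAME => 'cf', COMPRESSION => 'SNAPPY'}

# Inspect the table schema to confirm the compression setting
describe 'mytable'
```

The codec is a per-family choice, so a table can mix compressed and uncompressed families depending on each family's access pattern.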
Available codecs
The following are the different compression algorithms supported by HBase:
Lempel-Ziv-Oberhumer ( LZO ): This lossless data compression algorithm is written in ANSI C and requires a Java Native Interface (JNI) library for its integration with HBase. The algorithm is highly focused on decompression speed. Refer to http://wiki.apache.org/hadoop/UsingLzoCompression for further details.
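Because LZO depends on a native library, it is worth confirming that the codec is correctly installed on each region server before enabling it on a column family. HBase ships a CompressionTest utility for this; a minimal check might look like the following (the file path is only an example):

```
# Attempt to compress and decompress a local test file with the LZO codec;
# the run fails with a native-library error if LZO is not installed correctly
hbase org.apache.hadoop.hbase.util.CompressionTest file:///tmp/testfile.txt lzo
```

The same utility accepts other codec names (for example, gz or snappy), so it can be used to validate whichever codec you plan to configure.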