The HBase Architecture - HBase Essentials


The HBase Architecture

In the previous chapters, we learned the basic building blocks of HBase schema design and how to apply CRUD operations to the designed schema. In this chapter, we will look at HBase from an architectural viewpoint, covering the following topics:

• Data storage

• Data replication

• Securing HBase

Most developers and users have little interest in the preceding topics, but an administrator needs to understand how the underlying data is stored and replicated within HBase. Administrators are the people who deal with HBase from installation through cluster management (performance tuning, monitoring, failure, recovery, data security, and so on).

By the end of this chapter, we will also gain an insight into the integration of HBase and MapReduce. Let's start with data storage in HBase first.

Data storage

In HBase, tables are split into smaller chunks that are distributed across multiple servers. These smaller chunks are called regions, and the servers that host regions are called RegionServers. The master process handles the distribution of regions among RegionServers, and each RegionServer typically hosts multiple regions. In the HBase implementation, the HRegionServer and HRegion classes represent the region server and the region, respectively. HRegionServer contains the set of HRegion instances available to the client and handles two types of files for data storage:

• HLog (the write-ahead log file, also known as WAL)

• HFile (the real data storage file)
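
To make the idea of a table split into regions more concrete, the following is a minimal sketch in plain Java. It is a toy model, not the HBase client API: the class name RegionLookupSketch, the methods assignRegion and locate, and the server names are all hypothetical. It assumes that each region covers a contiguous range of row keys identified by its start key, and it compares keys as Java strings, whereas HBase compares raw bytes.

```java
import java.util.Map;
import java.util.TreeMap;

// Toy model only: this does not use or mirror the real HBase classes.
public class RegionLookupSketch {

    // Maps each region's start row key to the (hypothetical) server hosting it.
    // The first region starts at the empty key, so every row key is covered.
    private final TreeMap<String, String> regionStartKeyToServer = new TreeMap<>();

    // Records that the region beginning at startKey is hosted by serverName.
    public void assignRegion(String startKey, String serverName) {
        regionStartKeyToServer.put(startKey, serverName);
    }

    // A row belongs to the region with the greatest start key <= the row key.
    public String locate(String rowKey) {
        Map.Entry<String, String> region = regionStartKeyToServer.floorEntry(rowKey);
        if (region == null) {
            throw new IllegalStateException("No region covers row key: " + rowKey);
        }
        return region.getValue();
    }

    // Builds a table split into three regions spread over two servers,
    // so that one server hosts more than one region.
    public static RegionLookupSketch example() {
        RegionLookupSketch table = new RegionLookupSketch();
        table.assignRegion("", "regionserver-1");   // rows before "g"
        table.assignRegion("g", "regionserver-2");  // rows from "g" up to "p"
        table.assignRegion("p", "regionserver-1");  // rows from "p" onward
        return table;
    }

    public static void main(String[] args) {
        RegionLookupSketch table = example();
        System.out.println(table.locate("apple")); // regionserver-1
        System.out.println(table.locate("melon")); // regionserver-2
        System.out.println(table.locate("zebra")); // regionserver-1
    }
}
```

The sorted map stands in for the bookkeeping that decides which server holds a given row. In this sketch, regionserver-1 hosts two regions, which mirrors the point that a RegionServer typically hosts multiple regions.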
