The HBase Architecture
In the previous chapters, we learned the basic building blocks of HBase schema
design and how to apply CRUD operations to the designed schema. In
this chapter, we will look at HBase from an architectural viewpoint, covering the
following topics:
• Data storage
• Data replication
• Securing HBase
For most developers or users, the preceding topics are of little interest,
but an administrator needs to understand how the underlying data
is stored and replicated within HBase. Administrators are the people who deal with
HBase from its installation through to cluster management (performance tuning,
monitoring, failure, recovery, data security, and so on).
By the end of this chapter, we will also get an insight into the integration of HBase
and MapReduce. Let's start with data storage in HBase first.
Data storage
In HBase, tables are split into smaller chunks that are distributed across multiple
servers. These smaller chunks are called regions, and the servers that host regions are
called RegionServers. The master process handles the distribution of regions among
RegionServers, and each RegionServer typically hosts multiple regions. In the HBase
implementation, the HRegionServer and HRegion classes represent the region server
and the region, respectively. HRegionServer contains the set of HRegion instances
available to the client and handles two types of files for data storage:
• HLog (the write-ahead log file, also known as WAL)
• HFile (the real data storage file)
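To make the table, region, and RegionServer relationship concrete, the following is a minimal sketch of how a row key can be mapped to the server hosting its region. It is a toy model and not HBase's own code: the class RegionLookupSketch, its methods, and the server names rs1 and rs2 are hypothetical, and it assumes each region covers a contiguous range of row keys identified by its start key. It also compares keys as Java strings, whereas HBase orders row keys as raw bytes.

```java
import java.util.Map;
import java.util.TreeMap;

// Toy model only: this is NOT HBase's HRegionServer/HRegion implementation.
class RegionLookupSketch {
    // Maps each region's start row key to the (hypothetical) server hosting it.
    private final TreeMap<String, String> regionStartToServer = new TreeMap<>();

    // Plays the role of the master assigning a region to a RegionServer.
    void assignRegion(String startKey, String serverName) {
        regionStartToServer.put(startKey, serverName);
    }

    // A row belongs to the region with the greatest start key <= the row key.
    String locate(String rowKey) {
        Map.Entry<String, String> entry = regionStartToServer.floorEntry(rowKey);
        if (entry == null) {
            throw new IllegalStateException("No region covers row key: " + rowKey);
        }
        return entry.getValue();
    }

    public static void main(String[] args) {
        RegionLookupSketch table = new RegionLookupSketch();
        table.assignRegion("", "rs1");  // the first region starts at the empty key
        table.assignRegion("g", "rs2");
        table.assignRegion("p", "rs1"); // one server can host several regions

        for (String row : new String[] {"apple", "grape", "zebra"}) {
            System.out.println(row + " -> " + table.locate(row));
        }
    }
}
```

In this sketch, rs1 hosts two regions of the same table, which mirrors the statement that each RegionServer typically hosts multiple regions. A sorted map is used so that finding the owning region is a single floor lookup on the start keys.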