Databases Reference
In-Depth Information
systems.” WAL is a common technique used across a variety of database systems, including
popular relational database systems like PostgreSQL and MySQL. In HBase, a client program
can decide to turn WAL on or switch it off. Switching it off boosts write performance but reduces
reliability and recoverability in case of failure. When data is written to a region, it's first written to
the write-ahead log, if enabled. Soon afterwards, it's written to the region's in-memory store. When
the in-memory store is full, data is flushed to disk and persisted in the underlying distributed storage.
See Figure 4-9, which recaps the core aspects of a region server and a region.
FIGURE 4-9: A region server and a region (diagram labels: Region Server, Region, In-memory Store, Write Ahead Log, Wrapper File, Distributed File System)
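The write path described above can be sketched as a toy model. This is a minimal illustration, not HBase code: the class `ToyRegion`, its fields, the `memstore_limit` threshold, and the choice to drop log entries after a flush are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class ToyRegion:
    """Toy model of a region's write path (hypothetical, not the HBase API)."""
    memstore_limit: int = 3      # assumed flush threshold, counted in entries
    wal_enabled: bool = True     # a client may switch the write-ahead log off
    wal: list = field(default_factory=list)          # stands in for the write-ahead log
    memstore: dict = field(default_factory=dict)     # the in-memory store
    store_files: list = field(default_factory=list)  # stands in for files on distributed storage

    def put(self, row_key, value):
        # 1. Append to the write-ahead log first, if it is enabled.
        if self.wal_enabled:
            self.wal.append((row_key, value))
        # 2. Apply the write to the in-memory store.
        self.memstore[row_key] = value
        # 3. Flush once the in-memory store is full.
        if len(self.memstore) >= self.memstore_limit:
            self.flush()

    def flush(self):
        # Persist a sorted snapshot, then empty the in-memory store.
        self.store_files.append(sorted(self.memstore.items()))
        self.memstore.clear()
        # Assumption: logged entries are no longer needed once they are persisted.
        self.wal.clear()

    def crash_and_recover(self):
        # A crash loses the in-memory store; only logged writes can be replayed.
        self.memstore = {}
        for row_key, value in self.wal:
            self.memstore[row_key] = value
```

With the log enabled, unflushed writes survive `crash_and_recover()` because they are replayed from `wal`. With `wal_enabled=False`, `put` does less work per write, but any write that has not yet been flushed is lost on a crash, which is the performance versus reliability trade-off the text describes.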
If a distributed filesystem like the Hadoop Distributed File System (HDFS) is used, then a
master-worker pattern extends to the underlying storage scheme as well. In HDFS, a namenode and a
set of datanodes form a structure analogous to the configuration of master and range servers that
column databases like HBase follow. Thus, in such a situation each physical storage file for an
HBase column-family store ends up residing in an HDFS datanode. HBase leverages a filesystem
API to avoid strong coupling with HDFS, so this API acts as the intermediary for conversations
between an HBase store and a corresponding HDFS file. The API also allows HBase to work
seamlessly with other types of filesystems. For example, HBase could be used with CloudStore,
formerly known as Kosmos FileSystem (KFS), instead of HDFS.
Read more about CloudStore, formerly known as Kosmos FileSystem (KFS), at
http://kosmosfs.sourceforge.net/.
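The decoupling that a filesystem API provides can be illustrated with a small sketch. Everything here is hypothetical: the `FileSystem` interface, both backends, and `ColumnFamilyStore` are invented for the example and do not mirror the real Hadoop or HBase classes. The in-memory and local-disk backends merely stand in for interchangeable storage systems such as HDFS or CloudStore.

```python
import tempfile
from abc import ABC, abstractmethod
from pathlib import Path


class FileSystem(ABC):
    """Hypothetical minimal filesystem interface (not the Hadoop API)."""

    @abstractmethod
    def write(self, path: str, data: bytes) -> None: ...

    @abstractmethod
    def read(self, path: str) -> bytes: ...


class InMemoryFileSystem(FileSystem):
    """Stand-in backend that keeps files in a dictionary."""

    def __init__(self):
        self.files = {}

    def write(self, path, data):
        self.files[path] = data

    def read(self, path):
        return self.files[path]


class LocalDiskFileSystem(FileSystem):
    """Stand-in backend that stores files under a local root directory."""

    def __init__(self, root):
        self.root = Path(root)

    def _resolve(self, path):
        # Map an absolute-looking store path onto the local root.
        return self.root / path.lstrip("/")

    def write(self, path, data):
        target = self._resolve(path)
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_bytes(data)

    def read(self, path):
        return self._resolve(path).read_bytes()


class ColumnFamilyStore:
    """Toy store that talks only to the FileSystem interface."""

    def __init__(self, fs: FileSystem, family: str):
        self.fs = fs
        self.family = family

    def flush(self, file_id: str, data: bytes) -> str:
        path = f"/{self.family}/{file_id}"
        self.fs.write(path, data)
        return path

    def load(self, path: str) -> bytes:
        return self.fs.read(path)


def demo():
    # The same store code runs unchanged against either backend.
    results = []
    with tempfile.TemporaryDirectory() as root:
        for fs in (InMemoryFileSystem(), LocalDiskFileSystem(root)):
            store = ColumnFamilyStore(fs, "cf1")
            path = store.flush("0001", b"row1=a")
            results.append(store.load(path))
    return results
```

Because `ColumnFamilyStore` depends only on the interface, swapping the storage backend requires no change to the store itself, which is the kind of loose coupling the text attributes to the filesystem API.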