In HBase, there is a system-defined catalog table called hbase:meta that keeps the list of all the regions for user-defined tables. In versions prior to 0.96.0, HBase had two catalog tables, called -ROOT- and .META. (the trailing period is part of the name). The -ROOT- table was used to keep track of the location of the .META. table. From version 0.96.0 onwards, the -ROOT- table is removed and the .META. table is renamed hbase:meta. The location of hbase:meta is now stored in ZooKeeper. The following is the structure of the hbase:meta table.
Key: the region key, in the format ([table],[region start key],[region id]). A region with an empty start key is the first region in a table.
The values are as follows:
• info:regioninfo (the serialized HRegionInfo instance for this region)
• info:server (server:port of the RegionServer containing this region)
• info:serverstartcode (start time of the RegionServer process that contains this region)
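The row layout described above can be modeled in a few lines. The following Python sketch is illustrative only: MetaRow and locate are hypothetical names, the host names and region ids are made up, and keys are compared as Python strings, whereas HBase compares raw bytes. It assumes every table has a first region with an empty start key.

```python
from bisect import bisect_right
from dataclasses import dataclass


@dataclass(frozen=True)
class MetaRow:
    """Toy model of one hbase:meta row (hypothetical class, not an HBase API)."""
    table: str
    start_key: str    # an empty string marks the first region of the table
    region_id: int
    server: str       # info:server, written as "host:port"
    start_code: int   # info:serverstartcode

    @property
    def key(self) -> str:
        # Row key in the ([table],[region start key],[region id]) format.
        return f"{self.table},{self.start_key},{self.region_id}"


def locate(rows, table, row_key):
    """Return the MetaRow whose region covers row_key in the given table."""
    regions = sorted((r for r in rows if r.table == table),
                     key=lambda r: r.start_key)
    if not regions:
        raise KeyError(table)
    starts = [r.start_key for r in regions]
    # The covering region is the last one whose start key is <= row_key.
    return regions[bisect_right(starts, row_key) - 1]
```

With one region starting at the empty key and another starting at "m", a row such as "apple" resolves to the first region and "m" or "zebra" to the second.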
When a region of the table is split, two new columns are created in its row: info:splitA and info:splitB. These columns represent the two newly created regions. The values for these columns are also serialized HRegionInfo instances. Once the split process is complete, the row that contains the old region information is deleted.
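The bookkeeping around a split can be sketched as follows. This is a simplified model, not HBase code: the catalog is a plain dictionary keyed by (table, start key, region id) tuples, begin_split and finish_split are hypothetical helpers, and the point at which the daughter rows are written is an assumption made to keep the example short.

```python
def begin_split(meta, parent, split_key, id_a, id_b):
    """Record the two daughter regions on the parent's catalog row."""
    table, start_key, _ = parent
    daughter_a = (table, start_key, id_a)   # keeps the parent's start key
    daughter_b = (table, split_key, id_b)   # starts at the split point
    meta[parent]["info:splitA"] = daughter_a
    meta[parent]["info:splitB"] = daughter_b
    return daughter_a, daughter_b


def finish_split(meta, parent, server):
    """Add rows for the daughters, then delete the parent's row."""
    row = meta.pop(parent)
    for daughter in (row["info:splitA"], row["info:splitB"]):
        meta[daughter] = {"info:regioninfo": daughter, "info:server": server}
```

After finish_split runs, the parent key is gone from the dictionary and the two daughter keys together cover the key range the parent used to hold.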
When reading data, the client application first connects to ZooKeeper and looks up the location of the hbase:meta table. Next, the client's HTable instance queries the hbase:meta table to find the region that contains the rows of interest and the region server that is serving that region. The client application then caches the region and region server information so that future interactions can skip the lookup. If the region is reassigned by the load balancer or the region server has expired, a fresh lookup is done on the hbase:meta catalog table to get the new location of the user table region, and the cache is updated accordingly.
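The caching behavior described above can be sketched as a small client-side locator. This is a minimal illustration under stated assumptions: CachingLocator is a hypothetical class, meta_lookup stands in for the real query against hbase:meta and is assumed to return a (start key, end key, server) tuple, and an empty end key means the region has no upper bound.

```python
class CachingLocator:
    """Toy client-side cache of region locations (not the HBase client API)."""

    def __init__(self, meta_lookup):
        # meta_lookup(table, row) -> (start_key, end_key, server); hypothetical.
        self._meta_lookup = meta_lookup
        self._cache = {}          # table -> list of (start_key, end_key, server)
        self.meta_lookups = 0     # counts how often the catalog was consulted

    def _cached(self, table, row):
        for start, end, server in self._cache.get(table, []):
            if start <= row and (end == "" or row < end):
                return (start, end, server)
        return None

    def locate(self, table, row):
        """Return the cached location, consulting the catalog only on a miss."""
        hit = self._cached(table, row)
        if hit is None:
            self.meta_lookups += 1
            hit = self._meta_lookup(table, row)
            self._cache.setdefault(table, []).append(hit)
        return hit

    def invalidate(self, table, row):
        """Drop the cached region covering row, for example after a failed call."""
        hit = self._cached(table, row)
        if hit is not None:
            self._cache[table].remove(hit)
```

A caller would invoke invalidate when a request to the cached server fails, so that the next locate call goes back to the catalog and picks up the new assignment.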
At the object level, the HRegionServer class is responsible for opening the regions it serves, which it does by creating HRegion objects. Each HRegion instance sets up a Store instance that has one or more StoreFile instances (wrappers around HFile) and a MemStore. The MemStore accumulates data edits as they happen and buffers them in memory. It also matters for reads, because the most recent edits to the table data are served from it.
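The relationship between the MemStore and the store files can be illustrated with a toy model. The class below is a sketch, not the HBase implementation: ToyStore and its methods are hypothetical, and it flushes after a fixed number of buffered rows, whereas HBase decides when to flush based on the size of the buffered data.

```python
class ToyStore:
    """Toy model of a store: one in-memory buffer plus immutable flushed files."""

    def __init__(self, flush_threshold=3):
        self.memstore = {}        # row -> latest value, held in memory
        self.store_files = []     # oldest first; each is a sorted, immutable tuple
        self.flush_threshold = flush_threshold   # hypothetical row-count limit

    def put(self, row, value):
        """Buffer an edit in memory and flush once the buffer is full."""
        self.memstore[row] = value
        if len(self.memstore) >= self.flush_threshold:
            self.flush()

    def flush(self):
        """Write the buffered edits out as a new sorted store file."""
        if self.memstore:
            self.store_files.append(tuple(sorted(self.memstore.items())))
            self.memstore = {}

    def get(self, row):
        """Check the in-memory buffer first, then store files from newest to oldest."""
        if row in self.memstore:
            return self.memstore[row]
        for store_file in reversed(self.store_files):
            for key, value in store_file:
                if key == row:
                    return value
        return None
```

Because get consults the in-memory buffer before any flushed file, an edit that has not been flushed yet still wins over an older value on disk, which is the sense in which the MemStore gives access to recent edits.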