e40be0371f43135e36ea67edec6e31e3/cf1
0 2014-02-28 16:40 /hbase/my_table/e40be0371f43135e36ea67edec6e31e3/cf2
As can be seen, four subdirectories have been created under /hbase/my_table.
Each subdirectory is named with a hash of its region's name, which includes the
table name and the region's start row. Under each of these directories are the directories
for the column families, cf1 and cf2 in the example, and the .regioninfo
file, which contains several options and attributes for how the region will be
maintained. The column family directories store the keys and values for the
corresponding column qualifiers. Column qualifiers from one column family
should seldom be read together with column qualifiers from another. The
reason for separating column families is to minimize the amount of unnecessary
data that HBase has to sift through within a region to find the requested data.
Requesting data from two column families means that multiple directories have to
be scanned to pull all the desired columns, which defeats the purpose of creating
the column families in the first place. In such cases, the table design may be better
off with just one column family. In practice, the number of column families should
be no more than two or three. Otherwise, performance issues may arise [30].
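The directory naming and the cost of reading across column families can be sketched in a few lines of Python. This is an illustration only, not HBase's implementation: the region name string below is hypothetical, the helper functions are invented for this example, and an MD5 digest is used simply as an example hash that yields a 32-character hexadecimal name like those in the listing.

```python
import hashlib
import posixpath


def region_dir(table_root, region_name):
    """Build a region directory path named after a hash of the region name."""
    digest = hashlib.md5(region_name.encode("utf-8")).hexdigest()
    return posixpath.join(table_root, digest)


def family_dirs_to_scan(region_path, requested_columns):
    """Return the column family directories a read of these columns must visit."""
    # Each column is written as 'family:qualifier'; only the family
    # determines which directory holds the data.
    families = {column.split(":", 1)[0] for column in requested_columns}
    return sorted(posixpath.join(region_path, family) for family in families)


# Hypothetical region name (table, start row, identifier), for illustration only.
region = region_dir("/hbase/my_table", "my_table,000500,1393605600000")

one_family = family_dirs_to_scan(region, ["cf1:cq1", "cf1:cq2"])
two_families = family_dirs_to_scan(region, ["cf1:cq1", "cf2:cq3"])

print(len(one_family), len(two_families))  # prints: 1 2
```

Reading cq1 and cq2 touches a single directory because both qualifiers live in cf1, whereas reading cq1 and cq3 touches two. This is why qualifiers that are usually requested together belong in the same column family.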
The following operations add data to the table using the put command. In
these three put operations, data1 and data2 are entered into column qualifiers
cq1 and cq2, respectively, in column family cf1. The value data3 is entered into
column qualifier cq3 in column family cf2. The row is designated by row key
000700 in each operation.
hbase> put 'my_table', '000700', 'cf1:cq1', 'data1'
0 row(s) in 0.0030 seconds
hbase> put 'my_table', '000700', 'cf1:cq2', 'data2'
0 row(s) in 0.0030 seconds
hbase> put 'my_table', '000700', 'cf2:cq3', 'data3'
0 row(s) in 0.0040 seconds
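The effect of these three put operations on the logical row can be sketched with a toy in-memory model in Python. The ToyTable class and its methods are hypothetical names made up for this illustration; they are not part of any HBase API, and the model ignores details such as cell timestamps and versions.

```python
from collections import defaultdict


class ToyTable:
    """Toy model of a table: row key -> column family -> qualifier -> value."""

    def __init__(self, families):
        self.families = set(families)
        self.rows = defaultdict(lambda: defaultdict(dict))

    def put(self, row_key, column, value):
        # A column is addressed as 'family:qualifier', as in the shell commands.
        family, qualifier = column.split(":", 1)
        if family not in self.families:
            raise KeyError(f"unknown column family: {family}")
        self.rows[row_key][family][qualifier] = value

    def get(self, row_key):
        # Flatten the nested structure back into 'family:qualifier' keys.
        families = self.rows.get(row_key, {})
        return {
            f"{family}:{qualifier}": value
            for family, qualifiers in families.items()
            for qualifier, value in qualifiers.items()
        }


table = ToyTable(["cf1", "cf2"])
table.put("000700", "cf1:cq1", "data1")
table.put("000700", "cf1:cq2", "data2")
table.put("000700", "cf2:cq3", "data3")

row = table.get("000700")
print(row)
# prints: {'cf1:cq1': 'data1', 'cf1:cq2': 'data2', 'cf2:cq3': 'data3'}
```

All three values belong to the same row key, 000700, but they are grouped by column family, mirroring the separate cf1 and cf2 directories on disk.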