HBase Versus RDBMS
HBase and other column-oriented databases are often compared to more traditional and
popular relational databases, or RDBMSs. Although they differ dramatically in their
implementations and in what they set out to accomplish, they are potential solutions
to the same problems, so the comparison is a fair one to make.
As described previously, HBase is a distributed, column-oriented data storage system. It
picks up where Hadoop left off by providing random reads and writes on top of HDFS. It
has been designed from the ground up with a focus on scale in every direction: tall in
numbers of rows (billions), wide in numbers of columns (millions), and able to be horizontally
partitioned and replicated across thousands of commodity nodes automatically. The table
schemas mirror the physical storage, creating a system for efficient data structure
serialization, storage, and retrieval. The burden is on the application developer to make
use of this storage and retrieval in the right way.
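The data model described above can be pictured with a small in-memory sketch. This is not the HBase client API: ToyTable and its methods are hypothetical names invented for illustration, and the sketch assumes a single process with no persistence, versioning, or replication. It shows only the shape of the model: rows kept sorted by key, sparse cells addressed by column name, random reads and writes by row key, and contiguous key ranges as the unit of horizontal partitioning.

```python
import bisect


class ToyTable:
    """Hypothetical in-memory stand-in for a sparse, sorted, column-oriented table."""

    def __init__(self):
        self._keys = []   # row keys, always kept in sorted order
        self._rows = {}   # row key -> {column name -> value}

    def put(self, row_key, column, value):
        # Random write: only cells that are actually written occupy space.
        if row_key not in self._rows:
            bisect.insort(self._keys, row_key)
            self._rows[row_key] = {}
        self._rows[row_key][column] = value

    def get(self, row_key):
        # Random read by row key; returns a copy of that row's cells.
        return dict(self._rows.get(row_key, {}))

    def scan(self, start_key, stop_key):
        # Range scan over [start_key, stop_key) in row-key order.
        lo = bisect.bisect_left(self._keys, start_key)
        hi = bisect.bisect_left(self._keys, stop_key)
        return [(k, dict(self._rows[k])) for k in self._keys[lo:hi]]

    def split(self, n_parts):
        # Horizontal partitioning: cut the sorted key space into contiguous ranges.
        size = max(1, -(-len(self._keys) // n_parts))  # ceiling division
        return [self._keys[i:i + size] for i in range(0, len(self._keys), size)]


table = ToyTable()
# Column names here use an illustrative "family:qualifier" style.
table.put("user#0003", "info:name", "carol")
table.put("user#0001", "info:name", "alice")
table.put("user#0001", "info:email", "alice@example.com")
table.put("user#0002", "stats:logins", 7)

row = table.get("user#0001")
window = table.scan("user#0001", "user#0003")
ranges = table.split(2)
```

Each row carries only the columns that were written to it, which is what lets such a table be very wide without paying for empty cells. There is no join, aggregate, or secondary index here: anything beyond lookups and scans by row key is left to the application.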
Strictly speaking, an RDBMS is a database that follows Codd's 12 rules. Typical RDBMSs
are fixed-schema, row-oriented databases with ACID properties and a sophisticated SQL
query engine. The emphasis is on strong consistency, referential integrity, abstraction from
the physical layer, and complex queries through the SQL language. You can easily create
secondary indexes; perform complex inner and outer joins; and count, sum, sort, group, and
page your data across a number of tables, rows, and columns.
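As a rough illustration of these conveniences, the sketch below uses Python's built-in sqlite3 module with an in-memory database. The users and orders tables, the index name, and the report function are assumptions made up for this example; SQLite stands in for any relational engine and is not one of the systems discussed here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only on request

# Fixed schema, declared up front, plus a secondary index on a non-key column.
conn.executescript("""
CREATE TABLE users (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL
);
CREATE TABLE orders (
    id      INTEGER PRIMARY KEY,
    user_id INTEGER NOT NULL REFERENCES users(id),
    amount  INTEGER NOT NULL
);
CREATE INDEX idx_orders_user ON orders(user_id);
""")

with conn:  # one transaction: both inserts commit together or not at all
    conn.executemany("INSERT INTO users VALUES (?, ?)",
                     [(1, "alice"), (2, "bob"), (3, "carol")])
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)",
                     [(1, 1, 30), (2, 1, 20), (3, 2, 5)])


def report(page, page_size):
    """Join, count, sum, group, sort, and page in a single declarative query."""
    return conn.execute(
        """
        SELECT u.name, COUNT(o.id), COALESCE(SUM(o.amount), 0) AS total
        FROM users AS u
        LEFT OUTER JOIN orders AS o ON o.user_id = u.id
        GROUP BY u.id, u.name
        ORDER BY total DESC, u.name
        LIMIT ? OFFSET ?
        """,
        (page_size, page * page_size),
    ).fetchall()


first_page = report(0, 2)   # two highest spenders
second_page = report(1, 2)  # the remaining user, who has no orders
```

The query says what is wanted and leaves the access path to the engine, which is the abstraction from the physical layer mentioned above. The REFERENCES clause asks the database, rather than the application, to guard referential integrity.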
For a majority of small- to medium-volume applications, there is no substitute for the ease
of use, flexibility, maturity, and powerful feature set of available open source RDBMS
solutions such as MySQL and PostgreSQL. However, if you need to scale up in terms of
dataset size, read/write concurrency, or both, you'll soon find that the conveniences of an
RDBMS come at an enormous performance penalty and make distribution inherently
difficult. The scaling of an RDBMS usually involves breaking Codd's rules, loosening ACID
restrictions, forgetting conventional DBA wisdom, and, on the way, losing most of the
desirable properties that made relational databases so convenient in the first place.
Successful Service
Here is a synopsis of how the typical RDBMS scaling story runs. The following list
presumes a successful growing service:
Initial public launch
Move from local workstation to a shared, remotely hosted MySQL instance with a
well-defined schema.