Here is why. Every NoSQL developer knows Brewer's CAP theorem, which states
that you cannot have consistency (C), availability (A), and partition tolerance (P) all
at the same time in a distributed database. By consistency, we mean that all the
nodes in a distributed database see the same data at all times. Don't laugh, I have seen
systems that give you a different answer depending on which node you ask.
This is bad enough for a social site and quite intolerable for a financial system.
Availability refers to the ability of a system to give an answer within a
reasonable time interval. If it can't, it should fail outright. In no case do we want
a hanging system or deadlocks.
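The "answer in time or fail outright" rule can be sketched on the client side as a call with a deadline. This is only an illustration, not an HBase API: the function name `query_with_deadline` and its parameters are hypothetical, and `query` stands for any callable that talks to the database.

```python
import concurrent.futures


def query_with_deadline(query, timeout_s):
    """Run query() and return its result, or fail outright after timeout_s seconds.

    Hypothetical helper for illustration; `query` is any zero-argument callable.
    """
    pool = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    future = pool.submit(query)
    try:
        # Wait at most timeout_s seconds for an answer.
        return future.result(timeout=timeout_s)
    except concurrent.futures.TimeoutError:
        # Fail fast instead of hanging. Note that the worker thread itself
        # is not cancelled; it simply stops being waited for.
        raise TimeoutError(f"no answer within {timeout_s} seconds") from None
    finally:
        # Do not block here waiting for a slow query to finish.
        pool.shutdown(wait=False)
```

The caller then decides what a failure means: retry, ask another node, or report the error. Either way it never waits indefinitely.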
Finally, partition tolerance implies that the system keeps on operating even when
parts of it fail or are cut off from the rest by a network failure.
In terms of the CAP theorem, HBase is a CP-type system, one that stresses
consistency and partition tolerance (for example, refer to
http://en.wikipedia.org/wiki/Apache_HBase). This is not to say that HBase is not
highly available. HBase, after all, runs behind Facebook. However, in the world of
NoSQL, HBase is considered to emphasize consistency and partition tolerance over
availability. Now, let's apply this theory to generating keys in the database.
If you ask the database to generate the key, then which part of it are you asking?
You might have some servers placed close to you and the rest placed remotely, and
you are asking the database to generate a unique sequence key. Obviously, you
are asking for consistency: every node must agree on the next value. If so,
availability will suffer and your key generation will become a bottleneck. So, by
the nature of things, we are forced to generate our primary keys ourselves.
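As a minimal sketch of what generating the key ourselves can look like, the Python below builds keys on the client without asking any server for the next value. Neither approach comes from the text above. The names `random_key` and `ClientKeyGenerator` are hypothetical, and the second approach assumes that each client is given a `client_id` that no other client uses.

```python
import itertools
import uuid


def random_key() -> str:
    """Return a random 128-bit key as 32 hex characters; no coordination needed."""
    return uuid.uuid4().hex


class ClientKeyGenerator:
    """Hypothetical helper: keys are unique as long as client_id is unique.

    Assumption: the counter lives only in memory, so a restarted client
    needs a fresh client_id (or a persisted counter) to avoid reusing keys.
    """

    def __init__(self, client_id: str) -> None:
        self.client_id = client_id
        self._counter = itertools.count(1)  # local sequence, never shared

    def next_key(self) -> str:
        # Zero-padding keeps one client's keys in lexicographic order,
        # which matters because HBase sorts rows by the bytes of the row key.
        return f"{self.client_id}-{next(self._counter):012d}"
```

For example, `ClientKeyGenerator("web01").next_key()` yields `web01-000000000001`. The trade-off is that such keys are unique but not a single global sequence, which is exactly the consistency requirement we chose to give up.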
 