Shared Nothing was more recently popularized by Google, which has written systems such as its
Bigtable database and its MapReduce implementation that do not share state, and are therefore
capable of near-infinite scaling. The Cassandra database uses a shared-nothing architecture: it
has no central controller and no notion of master/slave, and all of its nodes are the same.
NOTE
You can read the 1986 paper “The Case for Shared Nothing” online at
http://db.cs.berkeley.edu/papers/hpts85-nothing.pdf. It's only a few pages. If you take a look, you'll see that many of the features of
shared-nothing distributed data architecture, such as ease of high availability and the ability to scale to a
very large number of machines, are the very things that Cassandra excels at.
MongoDB also provides auto-sharding capabilities to manage failover and node balancing. That
many nonrelational databases offer this automatically and out of the box is very handy; creating
and maintaining custom data shards by hand is a wicked proposition. It's good to understand
sharding in terms of data architecture in general, and in terms of Cassandra in particular,
because Cassandra takes an approach similar to key-based sharding to distribute data across nodes,
but does so automatically.
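To make key-based sharding concrete, here is a minimal sketch in Python. It is an illustration only, not Cassandra's actual partitioning code: the function name `shard_for`, the node names, and the choice of MD5 are all assumptions made for this example. The idea is simply that a hash of the key decides which node owns the row.

```python
import hashlib
from typing import Sequence


def shard_for(key: str, nodes: Sequence[str]) -> str:
    """Pick the node that owns `key` by hashing the key (hypothetical helper)."""
    # Use a stable hash; Python's built-in hash() is randomized per process
    # for strings, so it would place keys differently on each run.
    digest = hashlib.md5(key.encode("utf-8")).digest()
    # Interpret the first 8 bytes as an integer and map it onto a node index.
    index = int.from_bytes(digest[:8], "big") % len(nodes)
    return nodes[index]


# Hypothetical cluster of three identical nodes.
nodes = ["node-a", "node-b", "node-c"]

# Every client that knows the node list computes the same placement,
# so no central controller is needed to locate a row.
placement = {user: shard_for(user, nodes) for user in ["alice", "bob", "carol", "dave"]}

for user, node in placement.items():
    print(f"{user} -> {node}")
```

Note that with this simple modulo scheme, adding or removing a node changes the placement of most keys. That bookkeeping is part of what makes hand-rolled sharding painful, and part of what a database that shards automatically takes off your hands.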
Summary
In summary, relational databases are very good at solving certain data storage problems, but because
of their focus, they can also create problems of their own when it's time to scale. At that point,
you often need to find a way to get rid of your joins, which means denormalizing the data, which
means maintaining multiple copies of data and seriously disrupting your design, both in the database
and in your application. Further, you almost certainly need to find a way around distributed
transactions, which will quickly become a bottleneck. These compensatory actions are not directly
supported in any but the most expensive RDBMS. And even if you can write such a huge
check, you still need to choose partitioning keys carefully, so you can never entirely
ignore the limitation.
Perhaps more importantly, as we see some of the limitations of RDBMS and consequently some
of the strategies that architects have used to mitigate their scaling issues, a picture slowly starts
to emerge. It's a picture that makes some NoSQL solutions seem perhaps less radical and less
scary than we may have thought at first, and more like a natural expression and encapsulation of
some of the work that was already being done to manage very large databases.
Web Scale
An invention has to make sense in the world in which it is finished, not the world in which it is started.
—Ray Kurzweil