support enormous volumes of data; the fact that it does stands as a monument to the ingenious
architecture of the Web.
But some of this infrastructure is starting to bend under the weight.
In 1966, a company like IBM was in a position to really make people listen to its innovations.
It had the problems, and it had the brain power to solve them. As we enter the second
decade of the 21st century, we're starting to see similar innovations, even from young companies
such as Facebook and Twitter.
So perhaps the real question, then, is not “What problem do I have?” but rather, “What kinds of
things would I do with data if it wasn't a problem?” What if you could easily achieve fault
tolerance, availability across multiple data centers, tunable consistency, and massive scalability
even to the hundreds of terabytes, all from a client language of your choosing? Perhaps, you say,
you don't need that kind of availability or that level of scalability. And you know best. You're
certainly right, in fact, because if your current database didn't suit your current database needs,
you'd have a nonfunctioning system.
It is not my intention to convince you by clever argument to adopt a non-relational database such
as Apache Cassandra. It is only my intention to present what Cassandra can do and how it does
it so that you can make an informed decision and get started working with it in practical ways if
you find it applies. Only you know what your data needs are. I do not ask you to reconsider your
database—unless you're miserable with your current database, or you can't scale how you need
to already, or your data model isn't mapping to your application in a way that's flexible enough
for you. I don't ask you to consider your database, but rather to consider your organization, its
dreams for the future, and its emerging problems. Would you collect more information about
your business objects if you could?
Don't ask how to make Cassandra fit into your existing environment. Ask what kinds of data
problems you'd like to have instead of the ones you have today. Ask what new kinds of data you
would like. What understanding of your organization would you like to have, if only you could
enable it?
A Quick Review of Relational Databases
Though you are likely familiar with them, let's briefly turn our attention to some of the
foundational concepts in relational databases. This will give us a basis on which to consider more recent
advances in thought around the trade-offs inherent in distributed data systems, especially very
large distributed data systems, such as those that are required at web scale.