Database Reference
In-Depth Information
Chapter 1
The Story of Big Data at Google
Since its founding in 1998, Google has grown by multiple orders of magnitude
in several different dimensions—how many queries it handles, the size of
the search index, the amount of user data it stores, the number of services
it provides, and the number of users who rely on those services. From a
hardware perspective, the Google Search engine has gone from a server
sitting under a desk in a lab at Stanford to hundreds of thousands of servers
located in dozens of datacenters around the world.
The traditional approach to scaling (outside of Google) has been to scale the
hardware up as the demands on it grow. Instead of running your database
on a small blade server, run it on a Big Iron machine with 64 processors and
a terabyte of RAM. Instead of relying on inexpensive disks, the traditional
scaling path moves critical data to costly network-attached storage (NAS).
There are some problems with the scale-up approach, however:
• Scaled-up machines are expensive. If you need one that has twice the
processing power, it might cost you five times as much.
• Scaled-up machines are single points of failure. You might need to get
more than one expensive server in case of a catastrophic problem, and
each one usually ends up being built with so many backup and
redundant pieces that you're paying for a lot more hardware than you
actually need.
• Scale up has limits. At some point, you lose the ability to add more
processors or RAM; you've bought the most expensive and fastest
machine that is made (or that you can afford), and it still might not be
fast enough.
• Scale up doesn't protect you against software failures. If you have a Big
Iron server that has a kernel bug, that machine will crash just as easily
(and as hard) as your Windows laptop.
Google, from an early point in time, rejected scale-up architectures. It didn't,
however, do this because it saw the limitations more clearly or because it
was smarter than everyone else. It rejected scale-up because it was trying
to save money. If the hardware vendor quotes you $1 million for the server
Search WWH ::




Custom Search