replicated with the data and could be swapped in if the master NameNode failed.
Since 2010, there have been specialized releases of Hadoop that removed the single
point of failure of the Hadoop NameNode.
Although the NameNode was a weak link in setting up early Hadoop clusters, it
was usually not the primary cause of most service failures. Facebook did a study of
their service failures and found that only 10% were related to NameNode failures.
Most were the result of human error or systematic bugs on all Hadoop nodes.
8.3.4 Using a managed NoSQL service
Organizations find that even with an advanced NoSQL database, it takes a huge
amount of engineering to create and maintain predictable high-availability data
services that scale. Unless you have a large IT budget and specialized staff, you'll find it
more cost-effective to let companies experienced in database setup and configuration
handle the job and let your own staff focus on application development. Today the
costs for using cloud-based NoSQL applications are a fraction of what internal IT
departments charge to set up and configure systems.
Let's take a look at how an Amazon DynamoDB key-value store can be configured
to give you high availability.
8.3.5 Case study: using Amazon DynamoDB for a high-availability data store
The original Amazon Dynamo paper, introduced in chapter 1, was one of the most
influential papers in the NoSQL movement. This paper detailed how Amazon
rejected RDBMS designs and used its own custom distributed computing system to
support the requirements of horizontal scalability and high availability for its web
shopping cart.
Originally, Amazon didn't make the Dynamo software open source. Yet despite
the lack of source code, the Dynamo paper heavily influenced other NoSQL
systems such as Cassandra, Voldemort, and Riak. In January 2012, Amazon made DynamoDB
available as a database service for other developers to use. This case study reviews the
Amazon DynamoDB service and how it can be used as a fully managed, highly
available, scalable database service.
Let's start by looking at DynamoDB's high-level features. DynamoDB's key innovation
is its ability to quickly and precisely tune throughput. The service can reliably handle a
large volume of read and write transactions, which can be tuned on a
minute-by-minute basis by modifying values on a web page. Figure 8.4 shows an example of this
user interface.
DynamoDB handles how many servers are used and how the loads are balanced
between the servers. Amazon provides an API so you can change the provisioned
throughput programmatically based on the results of your load monitoring system.
No operator intervention is required. Your monthly Amazon bill will be automatically
adjusted as these parameters change.
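
As a rough illustration of changing provisioned throughput programmatically, the
Python sketch below turns load-monitoring numbers into new read and write capacity
values and passes them to an update_table call. The helper names (target_capacity,
retune_table), the 25% headroom, and the idea of passing in the current settings are
assumptions made for this example, not part of the DynamoDB service. The client
argument is assumed to behave like the boto3 DynamoDB client, whose update_table
call accepts TableName and ProvisionedThroughput.

```python
import math


def target_capacity(consumed_per_sec, headroom=0.25, minimum=1):
    """Return capacity units covering the observed load plus headroom.

    The headroom fraction and the floor of one unit are illustrative choices.
    """
    return max(minimum, math.ceil(consumed_per_sec * (1 + headroom)))


def retune_table(client, table_name, consumed_reads, consumed_writes, current):
    """Request new provisioned throughput when the target differs from current.

    `client` is any object with an update_table method shaped like boto3's.
    `current` is the table's present ProvisionedThroughput as a dict.
    Returns the requested settings, or None when no change is needed.
    """
    desired = {
        "ReadCapacityUnits": target_capacity(consumed_reads),
        "WriteCapacityUnits": target_capacity(consumed_writes),
    }
    if desired == current:
        return None  # already at the target, so skip the API call
    client.update_table(TableName=table_name, ProvisionedThroughput=desired)
    return desired
```

With boto3 installed and AWS credentials configured, the client would typically come
from boto3.client("dynamodb"), and a monitoring job could call retune_table on a
schedule with the consumed read and write rates it has measured.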