Table 1.2 The key case studies associated with the NoSQL movement—the name of the case study/standard, the business drivers, and the results (findings) of the selected solutions (continued)

Case study/standard: Google's Bigtable
Driver: Need to flexibly store tabular data in a distributed system.
Finding: By using a sparse matrix approach, users can think of all data as being stored in a single table with billions of rows and millions of columns without the need for up-front data modeling.

Case study/standard: Amazon's Dynamo
Driver: Need to accept a web order 24 hours a day, 7 days a week.
Finding: A key-value store with a simple interface can be replicated even when there are large volumes of data to be processed.

Case study/standard: MarkLogic
Driver: Need to query large collections of XML documents stored on commodity hardware using standard query languages.
Finding: By distributing queries to commodity servers that contain indexes of XML documents, each server can be responsible for processing data on its own local disk and returning the results to a query server.
1.3.1 Case study: LiveJournal's Memcache
Engineers working on the blogging system LiveJournal started to look at how their systems were using their most precious resource: the RAM in each web server. LiveJournal had a problem. Their website was so popular that the number of visitors using the site continued to increase on a daily basis. The only way they could keep up with demand was to continue to add more web servers, each with its own separate RAM.
To improve performance, the LiveJournal engineers found ways to keep the results of the most frequently used database queries in RAM, avoiding the expensive cost of rerunning the same SQL queries on their database. But each web server had its own copy of the query results in RAM; there was no way for any web server to know that the server next to it in the rack already had a copy of the query sitting in RAM.
So the engineers at LiveJournal devised a simple way to create a distinct "signature" of every SQL query. This signature, or hash, was a short string that represented a SQL SELECT statement. By sending a small message between web servers, any web server could ask the other servers whether they had a copy of the SQL result already executed. If one did, it would return the results of the query and avoid an expensive round trip to the already overwhelmed SQL database. They called their new system Memcache because it managed a cache of RAM memory.
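The idea generalizes to the cache-aside pattern that memcached clients still use today: hash the query text into a short key, check the shared cache, and only fall back to the database on a miss. Below is a minimal sketch in Python; the pymemcache client, the local memcached address, and the five-minute expiry are illustrative assumptions, not LiveJournal's original implementation.

import hashlib
from pymemcache.client.base import Client  # assumed memcached client library

cache = Client(("localhost", 11211))  # assumed local memcached instance

def query_signature(sql):
    # Hash the SQL text into a short, fixed-length key (the query's "signature").
    return "sql:" + hashlib.sha1(sql.encode("utf-8")).hexdigest()

def cached_query(sql, run_query):
    # Cache-aside: ask the shared cache first, fall back to the database on a miss.
    key = query_signature(sql)
    hit = cache.get(key)
    if hit is not None:
        return hit.decode("utf-8")  # another web server already ran this query
    result = run_query(sql)  # expensive round trip to the overloaded SQL database
    cache.set(key, result.encode("utf-8"), expire=300)  # share the result for 5 minutes
    return result

A real deployment would hash a normalized query plus its bind parameters, but the flow is the same: every cache hit spares the database one repeated query.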
Many other software engineers had come across this problem in the past. The concept of large pools of shared-memory servers wasn't new. What was different this time was that the engineers at LiveJournal went one step further. They not only made this system work (and work well), they shared their software using an open source license, and they also standardized the communications protocol between the web front ends (called the memcached protocol). Now anyone who wanted to keep their database from getting overwhelmed with repetitive queries could use their front-end tools.
 