Databases Reference
FIGURE 4.21
Sqoop2 architecture.
volumes of data efficiently. The next section discusses another popular set of technologies classified
as NoSQL.
NoSQL
Relational databases struggle to meet the scalability requirements of large volumes of transactional data, and
often fail when trying to scale up and scale out. The vendors of RDBMS-based technologies have tried
hard to address the scalability problem through replication, distributed processing, and many other models, but
the relational architecture and the ACID properties of the RDBMS have been a hindrance in meeting
the performance requirements of applications such as sensor networks, web applications, and trading platforms.
In the late 1980s a number of research papers were published about newer
models of SQL databases that were not based on ACID requirements and the relational model. Fast forward to
1998, when a new class of databases emerged that could support the requirements of
high-speed data in a pseudo-database environment but was not oriented completely toward SQL. The name
NoSQL (not only SQL) was coined by Eric Evans in 2009 for a user group meeting held to discuss the need
for nonrelational and non-SQL-driven databases. It has since become the industry-adopted name for a
class of databases that work on similar architectures but are purpose-built for different workloads.
Three significant papers changed the NoSQL database from a niche solution
into an alternative platform:
“Google Publishes the BigTable Architecture” (http://labs.google.com/papers/bigtable.html).
“Eric Brewer discusses the CAP Theorem” (http://lpd.epl.ch/sgilbert/pubs/BrewersConjecture-SigAct.pdf).
“Amazon publishes Dynamo” (http://www.allthingsdistributed.com/2007/10/amazons_dynamo.html).
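As a rough illustration of the kind of data model the BigTable paper describes (a map whose entries are addressed by row key, column key, and timestamp and are kept in sorted key order), the following Python sketch may help. It is a toy in-memory structure under stated assumptions, not Google's implementation: the class name SortedCellMap, its methods, and the sample row and column values are all hypothetical and chosen only for this example.

```python
import bisect


class SortedCellMap:
    """Toy in-memory map keyed by (row key, column key, timestamp).

    Hypothetical illustration only; not an API of any real product.
    """

    def __init__(self):
        self._keys = []    # sorted list of (row, column, -timestamp)
        self._values = {}  # key tuple -> stored value

    def put(self, row, column, timestamp, value):
        # Negate the timestamp so newer versions of a cell sort first.
        key = (row, column, -timestamp)
        if key not in self._values:
            bisect.insort(self._keys, key)
        self._values[key] = value

    def get(self, row, column, timestamp=None):
        """Return the newest value at or before `timestamp` (latest if None)."""
        bound = float("-inf") if timestamp is None else -timestamp
        i = bisect.bisect_left(self._keys, (row, column, bound))
        if i < len(self._keys) and self._keys[i][:2] == (row, column):
            return self._values[self._keys[i]]
        return None  # no version of this cell is old enough

    def scan_row(self, row):
        """Yield (column, timestamp, value) for one row in sorted key order."""
        i = bisect.bisect_left(self._keys, (row,))
        while i < len(self._keys) and self._keys[i][0] == row:
            _, column, neg_ts = self._keys[i]
            yield column, -neg_ts, self._values[self._keys[i]]
            i += 1


# Sample data (made up for the example): three versions of one cell.
table = SortedCellMap()
table.put("com.example.www", "contents:", 3, b"version at t=3")
table.put("com.example.www", "contents:", 5, b"version at t=5")
table.put("com.example.www", "contents:", 6, b"version at t=6")
table.put("com.example.www", "anchor:other.example", 9, b"Example")

latest = table.get("com.example.www", "contents:")          # newest version
as_of_4 = table.get("com.example.www", "contents:", 4)      # version visible at t=4
row_cells = list(table.scan_row("com.example.www"))         # all cells in the row
```

Because the keys are kept sorted, all versions of a cell and all cells of a row sit next to each other, so a point lookup or a row scan reduces to one binary search followed by a short sequential walk. A distributed store would add persistence and partitioning, which this sketch leaves out.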
Dynamo presented a highly available key-value store infrastructure, and BigTable presented a
data storage model based on a multidimensional sorted map, in which the three-dimensional combination
of a row key, column key, and timestamp addresses any value, even within petabytes of data. Both
 