Table 2-1. (continued)

Columns: Traditional IT; Web-Scale Applications; Big Data Analytics Initiatives

Solution Stack
    Traditional IT: Traditional IT enterprise platforms, mostly ACID compliant.
    Web-Scale Applications: Less proprietary, open source centric, commodity hardware
        centric (LAMP Stack - Linux, Apache, MySQL, PHP)
    Big Data Analytics Initiatives: Highly open source centric, scalability, performance
        is the key, consistency can be compromised (SMAQ Stack - Storage, Map-Reduce, Query)

Database
    Traditional IT: RDBMS platforms
    Web-Scale Applications: MySQL
    Big Data Analytics Initiatives: NoSQL

Compute
    Traditional IT: Proprietary
    Web-Scale Applications: Distributed processing, Linux on large number of commodity
        servers
    Big Data Analytics Initiatives: Distributed processing, data is node aware, Linux on
        large number of commodity servers running map-reduce jobs

Storage
    Traditional IT: Expensive SANs
    Web-Scale Applications: Scale out commodity NAS
    Big Data Analytics Initiatives: Scale out commodity NAS, Hadoop compatible file systems

The traditional IT stack (let's define it as “database, storage, and computing”) worked
quite well for a relatively small amount of highly valuable and highly structured data, but it
began to show serious limitations when faced with a number of challenges. For example, the
emergence of web-enabled applications, with user bases in the millions, called for cost-effective
and innovative approaches to distributing computation and data processing across large
numbers of commodity servers. Let's call the stack that emerged to meet this need the LAMP
stack (Linux, Apache, MySQL, and PHP). Almost every business is now a “digital business,” hence
the explosive growth in unstructured data (e.g., text, video, audio, and medical images) all
around us. This is why a new stack for IT called SMAQ (storage, map-reduce, and query) has emerged.
Let's discuss a few examples to fully understand the implications of the SMAQ
stack. Imagine, for example, that you are not only trying to store billions of interactions
happening in social chatter but also trying to perform sophisticated analytics on
those interactions: extracting the sentiments people express about a particular brand of
product, running correlation analysis, and linking those sentiments to your
product sales across stores. Conventional databases can neither handle this kind of
scale nor quickly answer these kinds of questions.
Relational databases were designed to record transactions in a highly consistent manner
(the ACID principles: atomicity, consistency, isolation, and durability), and thus they have
inherent limitations on scalability and performance. The scale in our example necessitates
a distributed approach to data, storage, and processing. Conventional storage and computing
approaches can't handle this kind of scale and complexity.
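
To make the distributed processing idea concrete, here is a minimal single-process sketch of the map-reduce pattern applied to the brand sentiment example. Everything in it is an illustrative assumption: the sample data, the brand names, and the function names (map_phase, shuffle, reduce_phase, sentiment_counts) are hypothetical, and a real SMAQ deployment would run the map and reduce steps as parallel jobs on many commodity servers rather than in one Python process.

```python
from collections import defaultdict
from itertools import chain

# Hypothetical sample data: each inner list stands for the interactions
# stored on one node, as (brand, sentiment) pairs.
PARTITIONS = [
    [("acme", "positive"), ("acme", "negative"), ("zenith", "positive")],
    [("acme", "positive"), ("zenith", "negative"), ("zenith", "negative")],
]


def map_phase(partition):
    """Emit ((brand, sentiment), 1) for every interaction in one partition."""
    return [((brand, sentiment), 1) for brand, sentiment in partition]


def shuffle(mapped_partitions):
    """Group emitted values by key, the step that sits between map and reduce."""
    groups = defaultdict(list)
    for key, value in chain.from_iterable(mapped_partitions):
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Sum the emitted counts for each (brand, sentiment) key."""
    return {key: sum(values) for key, values in groups.items()}


def sentiment_counts(partitions):
    # In a cluster, each map_phase call would run on the node holding that partition.
    mapped = [map_phase(partition) for partition in partitions]
    return reduce_phase(shuffle(mapped))


if __name__ == "__main__":
    for (brand, sentiment), count in sorted(sentiment_counts(PARTITIONS).items()):
        print(brand, sentiment, count)
```

The map step only ever looks at the data local to one partition, which is what lets the work spread across nodes; only the much smaller per-key counts need to move between machines before the reduce step.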