Database Reference
implementations. For example, Twitter's Rainbird project implements a
hierarchical streaming counting system using a database called Cassandra.
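To make "hierarchical counting" concrete, here is a minimal in-memory sketch. It is not Rainbird's design, which the text does not describe. The slash-separated key format and the names `hierarchy_prefixes` and `HierarchicalCounter` are assumptions made for illustration. Each incoming event increments a counter for its own key and for every ancestor of that key, so totals at any level can be read directly.

```python
from collections import Counter


def hierarchy_prefixes(key, sep="/"):
    """Return every level of a hierarchical key: 'a/b/c' -> ['a', 'a/b', 'a/b/c']."""
    parts = key.split(sep)
    return [sep.join(parts[:i]) for i in range(1, len(parts) + 1)]


class HierarchicalCounter:
    """Toy streaming counter: one event updates the count at every level."""

    def __init__(self):
        self.counts = Counter()

    def record(self, key, amount=1):
        # Increment the key itself and each of its ancestors.
        for prefix in hierarchy_prefixes(key):
            self.counts[prefix] += amount

    def count(self, key):
        # Counter returns 0 for keys it has never seen.
        return self.counts[key]


counter = HierarchicalCounter()
for event in ["com/example/home", "com/example/about", "com/other/home"]:
    counter.record(event)

print(counter.count("com"))          # all events under "com"
print(counter.count("com/example"))  # only events under "com/example"
```

A production system would keep these counters in a distributed store rather than in process memory; the sketch only shows the roll-up logic.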
The current second-generation systems have moved beyond task-specific
implementations into providing a general service for arranging streaming
computation. These systems, such as Storm (also a Twitter project, but
originally developed by a company called BackType), typically provide a
directed acyclic graph (DAG) mechanism for arranging the computation and
moving messages between different parts of the computation.
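The DAG idea can be sketched in a few lines of plain Python. This is not Storm's API. The `Topology` class, its method names, and the word-count example are hypothetical and run in a single process. Each node is a function that takes one message and returns zero or more output messages, and edges decide which nodes receive those outputs. The sketch assumes the graph is acyclic and does not check for cycles.

```python
from collections import defaultdict, deque


class Topology:
    """Toy single-process DAG of message handlers (illustrative only)."""

    def __init__(self):
        self.handlers = {}              # node name -> handler function
        self.edges = defaultdict(list)  # node name -> downstream node names

    def add_node(self, name, handler):
        self.handlers[name] = handler

    def connect(self, upstream, downstream):
        self.edges[upstream].append(downstream)

    def emit(self, source, message):
        """Push one message into `source` and let its outputs flow downstream."""
        queue = deque([(source, message)])
        while queue:
            node, msg = queue.popleft()
            for out in self.handlers[node](msg):
                for nxt in self.edges[node]:
                    queue.append((nxt, out))


word_counts = defaultdict(int)


def split_words(line):
    return line.split()      # one line in, many words out


def normalize(word):
    return [word.lower()]    # one word in, one word out


def count(word):
    word_counts[word] += 1
    return []                # sink node: emits nothing further


topology = Topology()
topology.add_node("split", split_words)
topology.add_node("normalize", normalize)
topology.add_node("count", count)
topology.connect("split", "normalize")
topology.connect("normalize", "count")

for line in ["Storm moves messages", "messages move through storm"]:
    topology.emit("split", line)

print(dict(word_counts))
```

A real framework distributes the nodes across machines and handles delivery guarantees and failures; the sketch only shows how the graph arranges the computation and routes messages.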
These frameworks, which are relatively young as software projects, are still
often under heavy development. They also lack the guidance of
well-understood computational paradigms, which means that they have to
learn what works well and what does not as they go along. As their
development progresses, the difference between batch-based systems and
streaming systems will probably erode. The map-reduce model is already
gaining the ability to implement more complicated computation to support
the demand for interactive querying, as part of projects such as
Cloudera's Impala and the Stinger project led by Hortonworks.
Storage
Storage options for real-time processing and analysis are plentiful—almost
to the point of being overwhelming. Whereas traditional relational
databases can be, and are, used in streaming systems, the preference is generally
for so-called “NoSQL” databases. This preference has a number of different
drivers, but the largest has generally been the need to prioritize performance
over the ACID (Atomicity, Consistency, Isolation, Durability) requirements
met by the traditional relational database. Although these requirements are
often met to some degree, a common feature of these databases is that they
rarely support all four.
The other thing that these databases usually lack when compared to a
relational database is a richness of schema and query language. The NoSQL
moniker is a reference to the fact that this class of database usually has
a much more restricted query language than that allowed by the SQL
(Structured Query Language) standard. This simplified and/or restricted
query language is usually coupled with a restricted schema, if any formal
schema mechanism exists at all.
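As a rough illustration of what a restricted query language and a missing schema mean in practice, the sketch below implements a toy key-value store. It does not model any particular NoSQL product. The `KeyValueStore` class and its `put`, `get`, and `scan` methods are assumptions made for this example. The only queries it supports are lookups by exact key and scans by key prefix, so any filtering on the stored values has to happen in client code.

```python
import bisect


class KeyValueStore:
    """Toy store: no schema, queries limited to key lookups and prefix scans."""

    def __init__(self):
        self._keys = []    # sorted list of keys, used for prefix scans
        self._values = {}  # key -> arbitrary value

    def put(self, key, value):
        if key not in self._values:
            bisect.insort(self._keys, key)
        self._values[key] = value

    def get(self, key, default=None):
        return self._values.get(key, default)

    def scan(self, prefix):
        """Return (key, value) pairs whose key starts with `prefix`, in key order."""
        start = bisect.bisect_left(self._keys, prefix)
        result = []
        for key in self._keys[start:]:
            if not key.startswith(prefix):
                break
            result.append((key, self._values[key]))
        return result


store = KeyValueStore()
store.put("user:1", {"name": "Ada", "city": "London"})
store.put("user:2", {"name": "Grace"})  # no schema: fields differ per record
store.put("page:home", 42)              # values need not share a type

users = store.scan("user:")
# There is no WHERE clause, so filtering on a field happens in the client.
londoners = [key for key, value in users if value.get("city") == "London"]
print(londoners)
```

In a relational database the last step would be a single declarative query against a fixed schema; here the caller has to know how keys are laid out and what each value might contain.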