Relational Databases
While NoSQL stores have been in vogue, some have claimed that relational
databases are under threat. This is simply untrue; relational
databases are alive and well and very good at what they do. They excel at
ad hoc aggregation and querying. This is especially true of column-oriented
databases, which have specialized indexing and storage mechanisms to
optimize many typical aggregation queries.
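
As a rough illustration of the kind of ad hoc aggregation query described above, the following minimal sketch uses Python's built-in sqlite3 module. SQLite is a row-oriented store, not a column-oriented one, so it only shows the shape of the query, not the storage optimizations. The page_views table, its columns, and the sample rows are hypothetical.

```python
import sqlite3

# Hypothetical page-view events; table and column names are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE page_views (country TEXT, page TEXT, latency_ms INTEGER)"
)
conn.executemany(
    "INSERT INTO page_views VALUES (?, ?, ?)",
    [
        ("US", "/home", 120),
        ("US", "/cart", 300),
        ("DE", "/home", 90),
        ("DE", "/home", 110),
    ],
)


def views_by_country(connection):
    """Ad hoc aggregation: row count and average latency per country."""
    return connection.execute(
        "SELECT country, COUNT(*), AVG(latency_ms) "
        "FROM page_views GROUP BY country ORDER BY country"
    ).fetchall()


rows = views_by_country(conn)
for country, views, avg_latency in rows:
    print(country, views, avg_latency)
```

The query was not planned for when the table was created; grouping by page instead of country would only require changing the SQL text, which is the flexibility the paragraph above refers to.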
One thing they usually don't do well is very high-speed ingest. “Real-time” in
the relational world usually refers to a delay of minutes or hours in processing
a data stream rather than milliseconds. (There are exceptions; several
companies offer databases that specialize in high-performance ingest.) It
usually works best to load these databases through a slower ETL process, which
will be discussed shortly.
Warehousing
Many businesses already have an existing business intelligence
infrastructure. This usually consists of a relational database environment
loaded by some sort of extract-transform-load (ETL) tool.
At large organizations this is typically managed by a database team using
a set of formal processing tools. At smaller organizations this may be more
ad hoc with a small MySQL database loaded by some scripts that nobody
remembers writing. In both cases there is a problem: the velocity
of streaming data overwhelms the ETL infrastructure or the database because
of the constant updates from the real-time processing system. Furthermore,
when (not if) bugs are introduced, the ability to recover from the error and
reprocess data helps improve operational efficiency and uptime. With the
introduction of modern tools for handling this processing, the warehousing
process can be more consistently implemented and scale more readily even
at smaller organizations that cannot invest in a complicated ETL
infrastructure.
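
One way to picture the two concerns above (constant updates and reprocessing after a bug) is a batch load that replaces a whole time window in a single transaction. The sketch below is only an assumption about how such a load might look, not a description of any particular ETL tool; the hourly_views table, the transform and load_batch functions, and the sample events are all hypothetical.

```python
import sqlite3
from collections import defaultdict


def transform(events):
    """Aggregate raw (hour, page) events into per-(hour, page) view counts."""
    counts = defaultdict(int)
    for hour, page in events:
        counts[(hour, page)] += 1
    return [(hour, page, n) for (hour, page), n in sorted(counts.items())]


def load_batch(conn, hour, events):
    """Replace all rows for one hour in a single transaction.

    Because the hour is deleted and rewritten together, re-running the
    batch after a bug fix does not double count.
    """
    rows = transform(e for e in events if e[0] == hour)
    with conn:  # commits on success, rolls back if an exception is raised
        conn.execute("DELETE FROM hourly_views WHERE hour = ?", (hour,))
        conn.executemany("INSERT INTO hourly_views VALUES (?, ?, ?)", rows)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE hourly_views (hour TEXT, page TEXT, views INTEGER)")

events = [
    ("2014-01-01T10", "/home"),
    ("2014-01-01T10", "/home"),
    ("2014-01-01T10", "/cart"),
]
load_batch(conn, "2014-01-01T10", events)
load_batch(conn, "2014-01-01T10", events)  # reprocessing the same hour

result = conn.execute(
    "SELECT page, views FROM hourly_views ORDER BY page"
).fetchall()
print(result)
```

Writing one aggregated batch per hour, rather than one update per event, keeps the write rate on the database low, and the delete-then-insert pattern makes a rerun safe.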
Hadoop as ETL and Warehouse
Since its public introduction in 2007, Hadoop has become an almost de
facto standard for the development of large-scale processing and ETL tasks.
It has accomplished this feat in spite of a fairly limited processing model due