Open Source Stack
Many of the technologies at Google have been publicly described in research
papers, which the open source community picked up and re-implemented.
When the open source Big Data options were in their infancy, they more or
less followed Google's lead.
Hadoop was designed to be very similar to the architecture described in the
MapReduce paper, and the Hadoop subprojects HDFS and HBase are close
to GFS and BigTable.
However, as the value of scale-out systems began to increase (and as
problems with traditional scale-up solutions became more apparent), the
open source Big Data stack diverged significantly from Google's
designs. A lot of effort has been
put into making Hadoop faster; people use technologies such as Hive and
Pig to query their data; and numerous NoSQL datastores have sprung up,
such as CouchDB, MongoDB, Cassandra, and others.
On the interactive query front, there are a number of open source options:
• Cloudera's Impala is an open source parallel execution engine similar to
Dremel. It allows you to query data inside HDFS and Hive without
extracting it.
• Amazon.com's Redshift is a fork of PostgreSQL which has been
modified to scale out across multiple machines. Unlike Impala, Redshift
is a hosted service, so it is managed in the cloud by Amazon.com.
• Drill is an Apache incubator project that aims to be for Dremel what
Hadoop was for MapReduce; Drill fills in the gaps of the Dremel paper
to provide a similar open source version.
• Facebook's Presto is a distributed SQL query engine that is similar to
Impala.
The days when Google held the clear advantage in innovation in the Big Data
space are over. Now, we're in an exciting time of robust competition among
different Big Data tools, technologies, and abstractions.
Google Cloud Platform
Google has released many of its internal infrastructure components to the
public under the aegis of the Google Cloud Platform. Google's public cloud
consists of a number of components, providing a complete Big Data