(Chapter 19) is a cluster computing framework for large-scale data processing; it provides a directed acyclic graph (DAG) engine, and APIs in Scala, Java, and Python. Chapter 20 is an introduction to HBase, a distributed column-oriented real-time database that uses HDFS for its underlying storage. And Chapter 21 is about ZooKeeper, a distributed, highly available coordination service that provides useful primitives for building distributed applications.
Finally, Part V is a collection of case studies contributed by people using Hadoop in interesting ways.
Supplementary information about Hadoop, such as how to install it on your machine, can be found in the appendixes.