Fig. 9.15 The architecture of HadoopDB. [Figure omitted. Labelled components: SQL query, SMS Planner, MapReduce job; Hadoop core master node with the MapReduce framework (JobTracker) and HDFS (NameNode); InputFormat implementations (database connector, task with InputFormat); nodes 1 to n, each with a TaskTracker, a DataNode and a database.]
used to analyze large-scale semi-structured data. It is a functional, declarative query
language that rewrites high-level queries, when appropriate, into low-level queries
consisting of MapReduce jobs that are evaluated using Apache Hadoop.
Core features include user extensibility and parallelism. Jaql consists of a scripting
language and compiler, as well as a runtime component [80]. It is able to process
data with no schema or only with a partial schema. However, Jaql can also exploit
rigid schema information when it is available, for both type checking and improved
performance.
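
The idea of lowering a declarative query into a map step and a reduce step can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it is neither Jaql syntax nor the Hadoop API, and the names `records`, `map_phase`, `reduce_phase` and `run_job` are hypothetical. The query being mimicked is "count records per language", run over records that do not share a fixed schema.

```python
from collections import defaultdict

# Semi-structured input: records need not share the same fields (no fixed schema).
# The data is made up purely for illustration.
records = [
    {"user": "ann", "lang": "en", "tags": ["db", "hadoop"]},
    {"user": "bob", "lang": "de"},
    {"user": "cy", "tags": ["db"]},  # no "lang" field at all
    {"user": "dee", "lang": "en", "tags": []},
]


def map_phase(record):
    """Filter and project: emit (lang, 1) only for records that carry a language."""
    lang = record.get("lang")
    if lang is not None:
        yield lang, 1


def reduce_phase(key, values):
    """Aggregate: count the records that fell into one group."""
    return key, sum(values)


def run_job(data, mapper, reducer):
    """In-process stand-in for one MapReduce job: map, shuffle by key, reduce."""
    groups = defaultdict(list)
    for rec in data:
        for key, value in mapper(rec):
            groups[key].append(value)
    return dict(reducer(k, vs) for k, vs in sorted(groups.items()))


counts = run_job(records, map_phase, reduce_phase)
print(counts)  # {'de': 1, 'en': 2}
```

The record without a `lang` field is simply skipped by the map step, which is one way a schema-free input can be handled without failing the whole job.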
Jaql uses a very simple data model, the Jaql Data Model (JDM): a JDM value is either an atom, an array or
a record. Most common atomic types are supported, including strings, numbers,
nulls and dates. Arrays and records are compound types that can be arbitrarily
nested. In more detail, an array is an ordered collection of values and can be used to
model data structures such as vectors, lists, sets or bags. A record is an unordered
collection of name-value pairs and can model structs, dictionaries and maps. Despite
its simplicity, JDM is very flexible. It allows Jaql to operate with a variety of
different data representations for both input and output, including delimited text
files, JSON files, binary files, Hadoop's sequence files, relational databases,
key-value stores or XML documents. Functions are first-class values in Jaql. They can
be assigned to a variable and are higher-order in that they can be passed as parameters
or used as a return value. Functions are the key ingredient for reusability as any Jaql
expression can be encapsulated in a function, and a function can be parameterized
in powerful ways. Figure 9.16 depicts an example of a Jaql script that consists of a