Tutorial Links
YARN is still an evolving technology, and the official Apache guide is really the best place
to get started.
Example Code
Writing applications for YARN is still quite involved, and the subject is too deep to cover here.
You can find a link to an excellent walk-through for building your first YARN application in
the preceding “Tutorial Links” section.
Spark
License: Apache License, Version 2.0
Activity: High
Purpose: Processing/Storage
Official Page: http://spark.apache.org/
Hadoop Integration: API Compatible
MapReduce is the primary workhorse at the core of most Hadoop clusters. While highly
effective for very large batch-analytic jobs, MapReduce has proven to be suboptimal for
applications like graph analysis that require iterative processing and data sharing.
Spark is designed to provide a more flexible model that supports many of the multipass
applications that falter in MapReduce. It accomplishes this goal by taking advantage of
memory whenever possible in order to reduce the amount of data that is written to and read
from disk. Unlike Pig and Hive, Spark is not a tool for making MapReduce easier to use. It is
a complete replacement for MapReduce that includes its own work execution engine.
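To make the difference concrete, here is a minimal sketch in plain Python of why multipass jobs benefit from keeping data in memory. It does not use the Spark API. The class `DiskDataset` and the functions `refine`, `run_disk_per_pass`, `run_cached`, and `compare` are hypothetical names invented for this illustration, and the "algorithm" is a toy that nudges an estimate toward the mean of the data on each pass. The point is only the number of disk reads: the batch-style loop rereads its input on every pass, while the in-memory loop reads it once and reuses it.

```python
import json
import os
import tempfile


class DiskDataset:
    """Hypothetical stand-in for a dataset on disk; counts every read."""

    def __init__(self, values):
        fd, self.path = tempfile.mkstemp(suffix=".json")
        with os.fdopen(fd, "w") as f:
            json.dump(values, f)
        self.reads = 0

    def load(self):
        # Each call goes back to disk, as a fresh batch job would.
        self.reads += 1
        with open(self.path) as f:
            return json.load(f)

    def close(self):
        os.remove(self.path)


def refine(values, estimate):
    """One pass of a toy iterative algorithm: move halfway toward the mean."""
    mean = sum(values) / len(values)
    return estimate + 0.5 * (mean - estimate)


def run_disk_per_pass(dataset, passes):
    """Batch style: every pass rereads the input from disk."""
    estimate = 0.0
    for _ in range(passes):
        estimate = refine(dataset.load(), estimate)
    return estimate


def run_cached(dataset, passes):
    """In-memory style: read once, then reuse the data across passes."""
    cached = dataset.load()
    estimate = 0.0
    for _ in range(passes):
        estimate = refine(cached, estimate)
    return estimate


def compare(passes=5):
    """Run both styles on the same values; return (result, disk reads) for each."""
    values = [2.0, 4.0, 6.0, 8.0]
    per_pass_data = DiskDataset(values)
    cached_data = DiskDataset(values)
    try:
        per_pass = run_disk_per_pass(per_pass_data, passes)
        cached = run_cached(cached_data, passes)
        return {
            "per_pass": (per_pass, per_pass_data.reads),
            "cached": (cached, cached_data.reads),
        }
    finally:
        per_pass_data.close()
        cached_data.close()
```

Calling `compare(5)` returns the same estimate for both styles, but the per-pass version records five disk reads against one for the cached version. In Spark itself the analogous step is asking the engine to keep a dataset in memory between passes; the sketch above only imitates that idea on a single machine.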