Processing Streaming Data - Real-Time Analytics

managing a large number of jobs. This allows each ApplicationMaster

to operate independently in Map-Reduce settings. It also allows for more

sophisticated resource management and security models because jobs are

now essentially completely independent.

Relationship to Samza

Samza is implemented as an application on top of YARN. The Samza

application has the required ApplicationMaster that is used to manage

Samza TaskRunners hosted within YARN Containers. The

TaskRunners execute StreamTasks, which are the Samza equivalent of a

Storm Bolt.
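The per-message callback model that a StreamTask shares with a Storm Bolt can be sketched in a few lines. This is only an illustration in Python, not Samza's Java API: the names Envelope, Collector, UppercaseTask, and run_task are hypothetical stand-ins invented for this sketch, and the stream name "words-upper" is likewise made up.

```python
from dataclasses import dataclass, field
from typing import Any, List


@dataclass
class Envelope:
    """Hypothetical stand-in for one message on a named stream."""
    stream: str
    key: Any
    message: Any


@dataclass
class Collector:
    """Hypothetical stand-in for the object a task uses to emit output."""
    sent: List[Envelope] = field(default_factory=list)

    def send(self, envelope: Envelope) -> None:
        self.sent.append(envelope)


class UppercaseTask:
    """A task in the StreamTask style: one callback invocation per message."""

    def process(self, envelope: Envelope, collector: Collector) -> None:
        # Transform the incoming message and emit it to an output stream.
        collector.send(
            Envelope("words-upper", envelope.key, envelope.message.upper())
        )


def run_task(task, messages, collector):
    """Plays the TaskRunner role: hand each message to the task in order."""
    for envelope in messages:
        task.process(envelope, collector)
    return collector
```

The task itself holds no loop and no knowledge of where messages come from; the runner owns delivery, which is the same division of labor described above between TaskRunners and StreamTasks.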

All of Samza's communication is hosted through Kafka brokers. Like HDFS

DataNodes in a Hadoop Map-Reduce application, these brokers are usually

co-located on the same machines hosting the Samza Containers. Samza

then uses Kafka's topics and natural partitioning to implement many of the

grouping features found in stream processing applications.
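To see why partitioning by key is enough to implement grouping, consider the following minimal sketch. It assumes a CRC32 hash purely for illustration (Kafka's own partitioner uses a different hash function), and the helper names partition_for, route, and count_per_partition are hypothetical. The point is only that every message with the same key lands in the same partition, so whichever task reads that partition sees the whole group.

```python
import zlib
from collections import Counter, defaultdict


def partition_for(key: str, num_partitions: int) -> int:
    """Map a key to a partition deterministically (illustrative hash only)."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def route(messages, num_partitions):
    """Distribute (key, value) pairs across partitions by key hash."""
    partitions = defaultdict(list)
    for key, value in messages:
        partitions[partition_for(key, num_partitions)].append((key, value))
    return partitions


def count_per_partition(partitions):
    """Each partition's consumer counts only the keys it receives."""
    return {p: Counter(key for key, _ in msgs) for p, msgs in partitions.items()}
```

Because no key is ever split across partitions, the per-partition counts are already the final per-key counts, and no cross-task merge step is needed.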

Getting Started with YARN and Samza

Although Hadoop 2 has been available for some time, it is still not

particularly common in production environments, though that is changing

rapidly. Most importantly for many users, Hadoop 2 is now supported by

Amazon's Elastic MapReduce product as a general release, making it easy to

spin up a cluster.

Apache YARN is also now supported by at least two of the major Hadoop

distributions, with more being added. Using their respective cluster

management tools to set up a YARN cluster is fairly painless. The only

downside is that packaged distributions tend to have a somewhat arbitrary

set of patches and versions that may lag the most recently released version

of the Apache project.

Additionally, it is possible to spin up a cluster using the Apache packages

either on a single node for experimentation or in a distributed fashion.

Single Node Samza

The easiest way to get started with Samza on a single node is to use the

single-node YARN installation packaged with the Hello Samza project. This
