An Overview of Large-Scale Stream Processing Engines - Large Scale and Big Data: Processing and Management


FIGURE 12.7 Storm cluster. (Diagram: one Nimbus node, three ZooKeeper nodes, and four Supervisor nodes.)

In addition to the supported built-in stream grouping mechanisms, the Storm system allows its users to define their own custom grouping mechanisms.
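To make the idea of a custom grouping concrete, the following is a minimal sketch of a routing rule that sends every word starting with the same letter to the same downstream task. It deliberately does not use Storm's own API: the `Grouping` interface, the `FirstLetterGrouping` class, and the task ids are all hypothetical names introduced only for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for a grouping hook; this is NOT Storm's actual interface.
interface Grouping {
    // Returns the ids of the downstream tasks that should receive the tuple.
    List<Integer> chooseTasks(List<Object> tupleValues);
}

// Routes a tuple by the first character of its first field, so that all
// words starting with the same letter (ignoring case) reach the same task.
final class FirstLetterGrouping implements Grouping {
    private final List<Integer> targetTasks;

    FirstLetterGrouping(List<Integer> targetTasks) {
        if (targetTasks.isEmpty()) {
            throw new IllegalArgumentException("no target tasks");
        }
        this.targetTasks = new ArrayList<>(targetTasks);
    }

    @Override
    public List<Integer> chooseTasks(List<Object> tupleValues) {
        String word = String.valueOf(tupleValues.get(0));
        int key = word.isEmpty() ? 0 : Character.toLowerCase(word.charAt(0));
        int index = Math.floorMod(key, targetTasks.size());
        return Collections.singletonList(targetTasks.get(index));
    }
}

final class GroupingDemo {
    public static void main(String[] args) {
        // Task ids 11, 12, 13 are made up for the demo.
        Grouping grouping = new FirstLetterGrouping(Arrays.asList(11, 12, 13));
        for (String word : new String[] {"storm", "Spout", "bolt"}) {
            List<Object> tuple = Collections.<Object>singletonList(word);
            System.out.println(word + " -> task " + grouping.chooseTasks(tuple));
        }
    }
}
```

A real custom grouping would be written against the interface Storm provides for this purpose; the sketch only shows the kind of decision such a grouping makes for each tuple.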

In general, a Storm cluster is superficially similar to a Hadoop cluster. One key difference is that a MapReduce job eventually finishes while a Storm job processes messages forever (or until the user kills it). In principle, there are two kinds of nodes on a Storm cluster:

• The Master node runs a daemon called Nimbus (similar to Hadoop's JobTracker), which is responsible for distributing code around the cluster, assigning tasks to machines, and handling failures.

• The Worker nodes run a daemon called the Supervisor. The supervisor listens for work assigned to its machine and starts or stops worker processes as necessary based on what Nimbus has assigned to it.
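The Supervisor behavior described above (start or stop workers so that the machine matches what Nimbus assigned) can be sketched as a simple reconciliation step. This is only an illustrative model, not Storm's implementation: the class `SupervisorSketch`, its methods, and the worker ids are hypothetical, and "starting" or "stopping" a worker is reduced to printing a line.

```java
import java.util.Arrays;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical model of one supervisor reconciliation pass; not Storm code.
final class SupervisorSketch {
    // Worker ids this machine is currently running.
    private final Set<String> running = new TreeSet<>();

    // Brings the locally running workers in line with the latest assignment.
    void sync(Set<String> assigned) {
        Set<String> toStop = new TreeSet<>(running);
        toStop.removeAll(assigned);   // running but no longer assigned
        Set<String> toStart = new TreeSet<>(assigned);
        toStart.removeAll(running);   // assigned but not yet running

        for (String worker : toStop) {
            System.out.println("stop " + worker);
            running.remove(worker);
        }
        for (String worker : toStart) {
            System.out.println("start " + worker);
            running.add(worker);
        }
    }

    // Returns a copy so callers cannot mutate the supervisor's view.
    Set<String> runningWorkers() {
        return new TreeSet<>(running);
    }

    public static void main(String[] args) {
        SupervisorSketch supervisor = new SupervisorSketch();
        supervisor.sync(new TreeSet<>(Arrays.asList("w1", "w2")));
        supervisor.sync(new TreeSet<>(Arrays.asList("w2", "w3")));
        System.out.println("running: " + supervisor.runningWorkers());
    }
}
```

Because each pass compares the desired assignment with the current state, running it again with the same assignment changes nothing, which suits a daemon that may be restarted at any time.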

Figure 12.7 illustrates the architecture of a Storm cluster. In a Storm cluster, all interactions between Nimbus and the Supervisors go through a ZooKeeper cluster, an open-source configuration and synchronization service for large distributed systems. Both the Nimbus daemon and the Supervisor daemons are fail-fast and stateless; all state is kept in ZooKeeper or on local disk. Communication between workers, whether they live on the same host or on different machines, is based on ZeroMQ sockets* over which serialized Java objects (representing tuples) are passed. Some of the features of ZeroMQ include

• Socket library that acts as a concurrency framework

• Faster than TCP, for clustered products and supercomputing

• Carries messages across inproc, IPC, TCP, and multicast

• Asynchronous I/O for scalable multicore message-passing apps

• Connect N-to-N via fanout, pubsub, pipeline, request-reply
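As a rough illustration of what "serialized Java objects representing tuples" means, the sketch below turns a tuple into bytes and back again. It rests on several assumptions: it uses Java's built-in object serialization as a stand-in for whatever serializer Storm actually uses, it replaces the ZeroMQ socket with an in-memory byte array, and the class name `TupleWireSketch` and its methods are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: a tuple is modeled as a list of serializable values.
final class TupleWireSketch {
    // Sender side: turn the tuple into the bytes that would go over a socket.
    static byte[] encode(List<Object> tuple) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            // Copy into an ArrayList, which is known to be Serializable.
            out.writeObject(new ArrayList<>(tuple));
        }
        return buffer.toByteArray();
    }

    // Receiver side: rebuild the tuple from the received bytes.
    @SuppressWarnings("unchecked")
    static List<Object> decode(byte[] bytes) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return (List<Object>) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        List<Object> tuple = Arrays.<Object>asList("storm", 3);
        byte[] wire = encode(tuple);
        System.out.println(wire.length + " bytes -> " + decode(wire));
    }
}
```

In a running cluster the byte array would be handed to the messaging layer instead of being decoded in the same process; the round trip here only shows that the receiving worker can reconstruct the same tuple values.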

* http://www.zeromq.org/.
