Processing Data with Storm
Storm is a stream-processing framework developed by BackType, a marketing
analytics company that analyzed Twitter data. Twitter acquired BackType in
2011, and Storm, a core piece of the BackType technology, was adopted and
eventually open-sourced by Twitter. Storm 0.9.0 is the current version and is
the focus of this section because it includes new features that simplify
Storm's use.
Storm's focus is the development and deployment of fault-tolerant distributed
stream computations. To do this, it provides a framework for hosting
applications and two models for building them.
The first is “classic” Storm, which frames the application as a directed
acyclic graph (DAG) called a topology. The top of this graph takes inputs
from the outside world, such as Kafka. These data sources are called spouts.
The spouts then pass the data on to units of computation called bolts.
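
The spout-to-bolt data flow can be illustrated with a small sketch. The code below is a toy model in plain Java and deliberately does not use Storm's API: the names ToyTopology, Spout, Bolt, SentenceSpout, SplitBolt, and CountBolt are all hypothetical, and everything runs in a single thread rather than on a cluster. It only shows the shape of a topology, assuming one spout that feeds a chain of two bolts.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Toy, single-threaded model of a topology. Not Storm's API. */
public class ToyTopology {

    /** A spout is a source of tuples from the outside world. */
    interface Spout {
        String nextTuple(); // returns null when the source is exhausted
    }

    /** A bolt consumes one tuple and may emit zero or more tuples downstream. */
    interface Bolt {
        List<String> execute(String tuple);
    }

    /** Hypothetical spout that replays a fixed list of sentences. */
    static class SentenceSpout implements Spout {
        private final Iterator<String> source;

        SentenceSpout(List<String> sentences) {
            this.source = sentences.iterator();
        }

        public String nextTuple() {
            return source.hasNext() ? source.next() : null;
        }
    }

    /** Hypothetical bolt that splits a sentence into words. */
    static class SplitBolt implements Bolt {
        public List<String> execute(String sentence) {
            List<String> words = new ArrayList<>();
            for (String word : sentence.trim().split("\\s+")) {
                if (!word.isEmpty()) {
                    words.add(word);
                }
            }
            return words;
        }
    }

    /** Hypothetical terminal bolt that keeps a running count per word. */
    static class CountBolt implements Bolt {
        final Map<String, Integer> counts = new TreeMap<>();

        public List<String> execute(String word) {
            counts.merge(word, 1, Integer::sum);
            return Collections.emptyList(); // nothing is emitted downstream
        }
    }

    /** Wires spout -> SplitBolt -> CountBolt and drains the spout. */
    static Map<String, Integer> run(List<String> sentences) {
        Spout spout = new SentenceSpout(sentences);
        Bolt split = new SplitBolt();
        CountBolt count = new CountBolt();

        String tuple;
        while ((tuple = spout.nextTuple()) != null) {
            for (String word : split.execute(tuple)) {
                count.execute(word);
            }
        }
        return count.counts;
    }

    public static void main(String[] args) {
        List<String> input = new ArrayList<>();
        input.add("storm processes streams");
        input.add("storm scales");
        System.out.println(run(input));
    }
}
```

In a real Storm topology the same roles exist, but the framework distributes spout and bolt instances across worker processes and handles routing and fault tolerance between them; none of that is modeled here.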
The second model for building applications, called Trident, is a higher-level
abstraction on top of the topology. This model is more focused on common
operations like aggregation and persistence. It provides primitives focused
on this type of computation, which are arranged according to their processing.
Trident then computes a topology by combining and splitting operations
into appropriate collections of bolts.
This chapter covers both models of computation. The second model, using
the Trident domain-specific language, is generally preferred because it
standardizes a number of common stream processing tasks. However, it
is sometimes advantageous to break with this common model and use the
underlying topology framework.
Components of a Storm Cluster
When the time comes to deploy a Storm topology in production, you must
construct a cluster. This cluster consists of three server daemons, all of
which must be present for the cluster to function properly: ZooKeeper, the
nimbus, and the supervisors. In this section, a Storm cluster capable of
supporting production workloads is constructed from these three components,
as shown in Figure 5.1.
 