Example Code
In this example, we're going to build a topology that reads comma-delimited reviews from a
ReviewSpout and keeps track of the number of times each title is reviewed. Defining a Storm
topology can get a little involved, so we'll just cover the highlights.
The first step of defining a topology is to define our inputs. We do this by associating a spout
with our topology. This spout will be responsible for reading data from some source, such as
a Twitter or an RSS feed.
Once we have our spout defined, we can start defining bolts. Bolts are responsible for
processing our data. In this case, we have two bolts: the first extracts the movie title from a
review, and the second counts the number of times an individual title appears:

    TopologyBuilder builder = new TopologyBuilder();
    builder.setSpout("review_spout", new ReviewSpout(), 10);
    builder.setBolt("extract_title", new TitleBolt(), 8);
    builder.setBolt("count", new TitleCount(), 15);

    // Build the "conf" object and configure it appropriately
    // for your job
    ...
    StormSubmitter.submitTopology("review_counter", conf,
        builder.createTopology());

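To make the data flow of this topology easier to picture, here is a minimal plain-Python sketch that does not use Storm at all. The three functions are stand-ins for the spout and the two bolts, chained as ordinary generators. The sample reviews and their field layout (reviewer, title, rating) are assumptions made for illustration; the text only tells us that reviews are comma-delimited and that the title is the second field.

```python
from collections import Counter

# Hypothetical sample input; the (reviewer, title, rating) layout is assumed.
SAMPLE_REVIEWS = [
    "alice,Dr. Strangelove,5",
    "bob,Brazil,4",
    "carol,Dr. Strangelove,3",
]


def review_spout(reviews):
    """Stand-in for ReviewSpout: yields one raw review line at a time."""
    for line in reviews:
        yield line


def extract_title(lines):
    """Stand-in for the extract_title bolt: emits the second comma-separated field."""
    for line in lines:
        yield line.split(",")[1]


def count_titles(titles):
    """Stand-in for the count bolt: tallies how often each title appears."""
    counts = Counter()
    for title in titles:
        counts[title] += 1
    return counts


# Chain the stages in the same order as the topology: spout, then title
# extraction, then counting.
counts = count_titles(extract_title(review_spout(SAMPLE_REVIEWS)))
print(counts.most_common())
```

In a real topology each stage would run as many parallel tasks on a cluster rather than as a single in-process generator, so this sketch only illustrates the order of processing, not Storm's execution model.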
Spouts and bolts can be authored in a variety of languages, and you can even mix languages
in an individual topology. For example, we authored our topology in Java, but we're going to
write one of our bolts in Python. This bolt extracts the film title from a review by splitting
the review on commas and retrieving the second field:

    import storm

    class TitleBolt(storm.BasicBolt):
        def process(self, tup):
            # The review text is the first value in the incoming tuple.
            words = tup.values[0].split(",")
            # Emit the second comma-separated field, the title.
            storm.emit([words[1]])

    TitleBolt().run()

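The bolt above assumes every review contains at least one comma; a line without one would raise an IndexError at words[1]. As a hedged sketch of one way to guard against that, the helper below (title_from_review is a hypothetical name, not part of Storm) returns None for malformed lines. Stripping surrounding whitespace is an extra assumption that the original bolt does not make.

```python
def title_from_review(review):
    """Return the second comma-separated field, or None if it is missing."""
    fields = review.split(",")
    if len(fields) < 2:
        # Malformed review: there is no second field to extract.
        return None
    # Assumption: whitespace around the title is not significant.
    return fields[1].strip()


print(title_from_review("alice,Brazil,4"))
print(title_from_review("no commas here"))
```

Inside the bolt's process method, the title could then be emitted only when the helper returns something other than None, so malformed reviews are skipped instead of failing the bolt.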