Advance Concepts in Storm - Real-time Analytics with Storm and Cassandra


Building a Trident topology

Trident gives a batching edge to Storm computation. It lets developers work with an abstraction layer over the Storm framework, which brings stateful processing and high throughput for distributed queries.

Trident shares Storm's architecture: it is built on top of Storm as an abstraction layer that adds micro-batching and the execution of SQL-like functions.

By way of analogy, Trident is conceptually a lot like Pig for batch processing. It supports joins, aggregates, grouping, filters, functions, and so on.

Trident also provides basic batch-processing guarantees, such as consistent processing and execution of the processing logic over each tuple exactly once.
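One common way to get an exactly-once effect on stored state is to remember, next to each value, the ID of the last batch that updated it, and to skip the update when the same batch is replayed. The following is a minimal plain-Java sketch of that idea, not Trident's own code: the class `TxCountSketch` and its methods are hypothetical names, and it assumes that a replayed batch carries the same ID and the same contents as the original.

```java
import java.util.HashMap;
import java.util.Map;

// Plain-Java illustration only; none of these names come from Storm.
final class TxCountSketch {
    // Per-word state: the running count plus the ID of the last batch applied.
    private static final class Entry {
        long lastTxId = -1L;
        long count = 0L;
    }

    private final Map<String, Entry> state = new HashMap<>();

    // Adds delta to the word's count unless this batch ID was already applied.
    void applyBatch(long txId, String word, long delta) {
        Entry entry = state.computeIfAbsent(word, k -> new Entry());
        if (entry.lastTxId == txId) {
            return; // replay of a batch that already updated the state
        }
        entry.count += delta;
        entry.lastTxId = txId;
    }

    long count(String word) {
        Entry entry = state.get(word);
        return entry == null ? 0L : entry.count;
    }

    public static void main(String[] args) {
        TxCountSketch counts = new TxCountSketch();
        counts.applyBatch(1L, "storm", 2L);
        counts.applyBatch(1L, "storm", 2L); // replayed batch, ignored
        counts.applyBatch(2L, "storm", 1L);
        System.out.println(counts.count("storm")); // prints 3
    }
}
```

The replayed batch leaves the count unchanged, so the logic may run more than once while the stored result reflects each batch only once.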

To understand how Trident works, let's look at a simple example.

The example we have picked achieves the following:

• Word count over a stream of sentences (the standard Storm word count topology)

• A query implementation to get the sum of counts for a set of listed words
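Before looking at the Trident code, it may help to see what these two goals compute, stripped of any streaming machinery. The sketch below is plain Java and uses no Storm API; `WordCountSketch`, `addSentence`, and `sumOfCounts` are hypothetical names chosen for illustration, and splitting on whitespace is an assumption about how sentences are tokenized.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Plain-Java illustration only; this is not a Trident topology.
final class WordCountSketch {
    private final Map<String, Long> counts = new HashMap<>();

    // Goal 1: split a sentence on whitespace and count each word.
    void addSentence(String sentence) {
        for (String word : sentence.trim().split("\\s+")) {
            if (!word.isEmpty()) {
                counts.merge(word, 1L, Long::sum);
            }
        }
    }

    // Goal 2: sum the counts for a listed set of words (unknown words count as 0).
    long sumOfCounts(List<String> words) {
        long total = 0L;
        for (String word : words) {
            total += counts.getOrDefault(word, 0L);
        }
        return total;
    }

    public static void main(String[] args) {
        WordCountSketch wc = new WordCountSketch();
        wc.addSentence("the basic storm topology do a great job");
        wc.addSentence("it gets micro batching over storm ");
        // "storm" appears twice, "the" once, "missing" never.
        System.out.println(wc.sumOfCounts(Arrays.asList("storm", "the", "missing"))); // prints 3
    }
}
```

In a Trident topology the same two steps run continuously over batches of tuples, with the counts held in state rather than in a local map.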

Here is the code for dissection:

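// Imports for Storm 0.9.x; from Storm 1.0 the packages start with org.apache.storm
import backtype.storm.tuple.Fields;
import backtype.storm.tuple.Values;
import storm.trident.testing.FixedBatchSpout;

// Test spout that emits a fixed set of sentences on the "sentence" field,
// in batches of at most 3 tuples
FixedBatchSpout myFixedspout = new FixedBatchSpout(
    new Fields("sentence"), 3,
    new Values("the basic storm topology do a great job"),
    new Values("they get tremendous speed and guaranteed processing"),
    new Values("that too in a reliable manner "),
    new Values("the new trident api over storm gets user more features "),
    new Values("it gets micro batching over storm "));

// Replay the sentences from the start once they are exhausted
myFixedspout.setCycle(true);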
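To make the two spout settings concrete, the sketch below imitates in plain Java how a fixed list can be cut into batches of at most three items and replayed when cycling is on. It is an approximation for illustration, not the source of `FixedBatchSpout`; the class `FixedBatchSketch` and its `nextBatch` method are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Plain-Java illustration only; this is not Storm's FixedBatchSpout.
final class FixedBatchSketch {
    private final List<String> sentences;
    private final int maxBatchSize;
    private final boolean cycle;
    private int index = 0;

    FixedBatchSketch(List<String> sentences, int maxBatchSize, boolean cycle) {
        this.sentences = sentences;
        this.maxBatchSize = maxBatchSize;
        this.cycle = cycle;
    }

    // Returns the next batch of at most maxBatchSize sentences.
    // With cycle == true the list restarts once it is exhausted;
    // otherwise every later batch is empty.
    List<String> nextBatch() {
        if (index >= sentences.size() && cycle) {
            index = 0;
        }
        List<String> batch = new ArrayList<>();
        while (batch.size() < maxBatchSize && index < sentences.size()) {
            batch.add(sentences.get(index++));
        }
        return batch;
    }

    public static void main(String[] args) {
        FixedBatchSketch spout = new FixedBatchSketch(
            Arrays.asList("s1", "s2", "s3", "s4", "s5"), 3, true);
        for (int i = 0; i < 3; i++) {
            System.out.println(spout.nextBatch());
        }
    }
}
```

With five sentences and a batch size of 3, the first batch holds three sentences, the second holds the remaining two, and the third starts again from the first sentence.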
