Advance Concepts in Storm - Real-time Analytics with Storm and Cassandra

Merge and join

The merge and join APIs provide interfaces for combining multiple streams. They work as follows:

• Merge : As the name suggests, merge combines two or more streams into one. The merged stream takes its output field names from the first stream:

myTridentTopology.merge(stream1,stream2,stream3);

• Join : This operation works like a traditional SQL join, except that it applies to each small batch rather than to the entire infinite stream coming out of the spout.

For example, consider a join where Stream 1 has the fields ["key", "val1", "val2"] and Stream 2 has the fields ["x", "val1"], and we execute the following code on these streams:

myTridentTopology.join(stream1, new Fields("key"), stream2,
    new Fields("x"), new Fields("key", "a", "b", "c"));

As a result, Stream 1 and Stream 2 are joined on key and x: key is the join field for Stream 1 and x is the join field for Stream 2.

The output tuples emitted from the join contain the following, in order:

• The list of all the join fields; in our case, the output field key carries the join value, which is key from Stream 1 and x from Stream 2.

• The list of all the non-join fields from all the streams involved, in the same order as the streams are passed to the join operation. In our case, a and b correspond to val1 and val2 of Stream 1, and c corresponds to val1 of Stream 2. Note that naming the output fields this way also removes any ambiguity between field names; in our case, val1 was present in both streams.
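The layout of the joined tuples described above can be illustrated with a small plain-Java sketch. It does not use Storm or Trident at all: the class BatchJoinSketch and the method joinBatch are hypothetical names introduced only for illustration, tuples are modeled as simple lists of values, and inner-join behavior on a single batch is assumed.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustration only: mimics the field layout of a join on ONE batch.
// This is not Trident code; all names here are hypothetical.
class BatchJoinSketch {

    // batch1 / batch2: the tuples of one batch from each stream.
    // keyIdx1 / keyIdx2: position of the join field in each stream's tuples.
    public static List<List<Object>> joinBatch(List<List<Object>> batch1, int keyIdx1,
                                               List<List<Object>> batch2, int keyIdx2) {
        List<List<Object>> out = new ArrayList<>();
        for (List<Object> t1 : batch1) {
            for (List<Object> t2 : batch2) {
                // Assumed inner join: keep only pairs whose join values match.
                if (!t1.get(keyIdx1).equals(t2.get(keyIdx2))) {
                    continue;
                }
                List<Object> joined = new ArrayList<>();
                // 1. The join field comes first and appears once ("key").
                joined.add(t1.get(keyIdx1));
                // 2. Non-join fields of the first stream ("a", "b").
                for (int i = 0; i < t1.size(); i++) {
                    if (i != keyIdx1) joined.add(t1.get(i));
                }
                // 3. Non-join fields of the second stream ("c").
                for (int i = 0; i < t2.size(); i++) {
                    if (i != keyIdx2) joined.add(t2.get(i));
                }
                out.add(joined);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Stream 1 tuples: ["key", "val1", "val2"]
        List<List<Object>> stream1Batch = Arrays.asList(
                Arrays.<Object>asList("k1", 1, 2),
                Arrays.<Object>asList("k2", 3, 4));
        // Stream 2 tuples: ["x", "val1"]
        List<List<Object>> stream2Batch = Arrays.asList(
                Arrays.<Object>asList("k1", 10),
                Arrays.<Object>asList("k3", 30));

        // Output layout is ["key", "a", "b", "c"]; prints [[k1, 1, 2, 10]]
        System.out.println(joinBatch(stream1Batch, 0, stream2Batch, 0));
    }
}
```

Only k1 appears in both batches, so a single joined tuple is produced, with the join value first, followed by the remaining values of Stream 1 and then those of Stream 2.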

When operations such as join are applied to streams that are fed into the topology by different spouts, the framework synchronizes the spouts with respect to batch emission, so that every join computation can include tuples from a batch of each spout.
