Storm topology wired to the Cassandra store
By now, you know why you should use Cassandra. You have been walked through
setting up Cassandra and creating column families, and we have covered the
various client/protocol options available for accessing the Cassandra data
store programmatically. As mentioned earlier, Hector has so far been the most
widely used API for accessing Cassandra, though the DataStax and Astyanax
drivers are fast catching up. For our exercise, we'll use the Hector API.
The use case we want to implement here uses Cassandra to support real-time,
ad hoc reporting on telecom data that is collated, parsed, and enriched by a
Storm topology.
As depicted in the preceding figure, the use case requires live telecom Call
Detail Record (CDR) capture using the data collection components (for practice,
we can use sample records and a simulator shell script to mimic the live CDR
feeds). The collated live feed is pushed into the RabbitMQ broker and then
consumed by the Storm topology.
In the topology, an AMQP spout acts as the consumer: it reads the data off the
queue and pushes it downstream to the topology bolts. Here, we have wired in
bolts that parse each message and convert it into a Plain Old Java Object
(POJO). Then comes a new entry in our topology, the Cassandra bolt, which
stores the data in the Cassandra cluster.
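
To make the flow described above concrete, the following is a minimal,
library-free sketch of the three stages in plain Java. It is not Storm or
Hector code: an in-memory queue stands in for the RabbitMQ queue read by the
AMQP spout, the parse method stands in for the parser bolt, and a nested map
(row key to column name to value) stands in for a Cassandra column family. The
CDR layout (call ID, caller, callee, duration in seconds, comma-separated) and
all class, method, and column names are assumptions made for illustration only.

```java
import java.util.ArrayDeque;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Queue;

/** Hypothetical CDR POJO; the field set is assumed, not taken from a real feed. */
class CallDetailRecord {
    private final String callId;
    private final String caller;
    private final String callee;
    private final int durationSeconds;

    CallDetailRecord(String callId, String caller, String callee, int durationSeconds) {
        this.callId = callId;
        this.caller = caller;
        this.callee = callee;
        this.durationSeconds = durationSeconds;
    }

    String getCallId() { return callId; }
    String getCaller() { return caller; }
    String getCallee() { return callee; }
    int getDurationSeconds() { return durationSeconds; }
}

class CdrPipelineSketch {

    /** Stand-in for the parser bolt: turns one raw message into a POJO. */
    static CallDetailRecord parse(String line) {
        String[] fields = line.split(",");
        if (fields.length != 4) {
            throw new IllegalArgumentException("expected 4 fields: " + line);
        }
        return new CallDetailRecord(fields[0].trim(), fields[1].trim(),
                fields[2].trim(), Integer.parseInt(fields[3].trim()));
    }

    /** Stand-in for the Cassandra bolt: writes one row keyed by call ID. */
    static void store(Map<String, Map<String, String>> columnFamily, CallDetailRecord cdr) {
        Map<String, String> columns = new LinkedHashMap<>();
        columns.put("caller", cdr.getCaller());
        columns.put("callee", cdr.getCallee());
        columns.put("durationSeconds", String.valueOf(cdr.getDurationSeconds()));
        columnFamily.put(cdr.getCallId(), columns);
    }

    /** Stand-in for the spout loop: drains the queue and feeds each message downstream. */
    static Map<String, Map<String, String>> run(Queue<String> queue) {
        Map<String, Map<String, String>> columnFamily = new LinkedHashMap<>();
        String message;
        while ((message = queue.poll()) != null) {
            store(columnFamily, parse(message));
        }
        return columnFamily;
    }

    public static void main(String[] args) {
        // Sample records play the role of the simulated live CDR feed.
        Queue<String> queue = new ArrayDeque<>();
        queue.add("cdr-001,15550100,15550101,42");
        queue.add("cdr-002,15550102,15550103,7");
        System.out.println(run(queue));
    }
}
```

In the real topology, each of these stages runs as a separate Storm component,
and the store step would issue a write through the Hector API instead of
updating a map.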