To use this in a topology, a KafkaSpout is created and then used like any
other spout. This simple test topology reports on the events being produced
on the Wikipedia-raw topic:
TopologyBuilder builder = new TopologyBuilder();
// config is the KafkaSpout configuration built in the previous section
builder.setSpout("kafka", new KafkaSpout(config), 10);
builder.setBolt("print", new LoggerBolt())
       .shuffleGrouping("kafka");

Config conf = new Config();
LocalCluster cluster = new LocalCluster();
cluster.submitTopology("example", conf,
        builder.createTopology());

// Let the topology run for a minute
Thread.sleep(60000);
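The LoggerBolt used here is not part of Storm itself and is presumably defined elsewhere in the text. A minimal sketch of such a bolt, assuming the pre-Apache backtype.storm packages used by this era of Storm and SLF4J for logging, might look like the following:
import backtype.storm.topology.BasicOutputCollector;
import backtype.storm.topology.OutputFieldsDeclarer;
import backtype.storm.topology.base.BaseBasicBolt;
import backtype.storm.tuple.Tuple;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

// Hypothetical bolt that simply logs each tuple it receives.
public class LoggerBolt extends BaseBasicBolt {
    private static final Logger LOG =
            LoggerFactory.getLogger(LoggerBolt.class);

    @Override
    public void execute(Tuple tuple, BasicOutputCollector collector) {
        // Log the first field of the tuple; nothing is emitted downstream.
        LOG.info(tuple.getString(0));
    }

    @Override
    public void declareOutputFields(OutputFieldsDeclarer declarer) {
        // This bolt is a sink, so it declares no output fields.
    }
}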
Connecting Storm to Flume
There is a fundamental “impedance mismatch” between Storm and Flume.
Storm is a polling consumer that assumes a pull model, whereas Flume's
sink model is fundamentally push-based. Although some attempts have been
made, there is no accepted mechanism for directly connecting Flume and
Storm.
There are solutions to this problem, but all of them involve adding another
queuing or data-movement system into the mix. Two basic approaches are
used “in the wild.” The first is to use a queuing system compatible with
the Advanced Message Queuing Protocol (AMQP). For Flume applications, the
RabbitMQ queuing system is popular because Kenshoo has developed a Flume
sink for it (http://github.com/kenshoo/flume-rabbitmq-sink). An AMQPSpout
is then used on the Storm side to read from RabbitMQ.
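AMQPSpout is distributed separately from Storm's core, and its exact package and constructor vary between implementations. The following is a minimal sketch of wiring one into a topology, assuming a constructor that takes the RabbitMQ connection details, the queue to consume from, and a Scheme for deserializing message bodies; the host, credentials, and queue name are all illustrative:
// Hypothetical constructor: check your AMQPSpout implementation for the
// actual signature. StringScheme (from Storm's core) deserializes message
// bodies as UTF-8 strings.
AMQPSpout amqpSpout = new AMQPSpout(
        "rabbit.example.com",  // RabbitMQ host (illustrative)
        5672,                  // standard AMQP port
        "guest", "guest",      // credentials (illustrative)
        "/",                   // virtual host
        "flume-events",        // queue the Flume sink writes to (illustrative)
        new StringScheme());

TopologyBuilder builder = new TopologyBuilder();
builder.setSpout("amqp", amqpSpout, 10);
builder.setBolt("print", new LoggerBolt())
       .shuffleGrouping("amqp");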
The other option is to use a Kafka sink for Flume and then integrate Storm
with Kafka as described in the previous section. At this point, the only
real reason to use Flume is to integrate with an existing infrastructure.
You can find the Flume/Kafka sink at https://github.com/baniuyao/
flume-ng-kafka-sink.
Distributed Remote Procedure Calls
In addition to the development of standard processing topologies that
consume data from a stream, Storm also includes facilities for implementing
distributed remote procedure calls (DRPC).