• The SystemStreamPartition object, accessible via the
getSystemStreamPartition method. This identifies where the message
came from: its system, stream, and partition. The message's offset is
available separately through the envelope's getOffset method.
Most tasks do not need to access this information directly.
The SystemProducer is usually accessed through a SystemStream
defined as a static final member of the task's class. For example, to define
a Kafka output stream on the topic “output”, add the following line to the
class definition:
private static final SystemStream OUTPUT =
new SystemStream("kafka","output");
This SystemStream object is used, along with the message payload and an
optional key payload, to initialize an OutgoingMessageEnvelope. The
envelope is then passed to the MessageCollector, which places the
outgoing message on the specified stream.
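The flow just described can be sketched as follows. This is a minimal illustration, not Samza itself: the nested SystemStream, OutgoingMessageEnvelope, and MessageCollector types are simplified stand-ins that only mimic the shape of the Samza classes so the example compiles on its own, and ListCollector and emit are hypothetical helpers introduced here. In a real job the framework supplies the collector.

```java
import java.util.ArrayList;
import java.util.List;

// Stand-in types nested in one class so the sketch compiles without Samza.
// They only mimic the shape of the Samza classes named in the text.
final class OutputSketch {

    static final class SystemStream {
        private final String system;
        private final String stream;
        SystemStream(String system, String stream) {
            this.system = system;
            this.stream = stream;
        }
        String getSystem() { return system; }
        String getStream() { return stream; }
    }

    static final class OutgoingMessageEnvelope {
        private final SystemStream systemStream;
        private final Object key;      // optional, may be null
        private final Object message;
        OutgoingMessageEnvelope(SystemStream systemStream, Object key, Object message) {
            this.systemStream = systemStream;
            this.key = key;
            this.message = message;
        }
        OutgoingMessageEnvelope(SystemStream systemStream, Object message) {
            this(systemStream, null, message);
        }
        SystemStream getSystemStream() { return systemStream; }
        Object getKey() { return key; }
        Object getMessage() { return message; }
    }

    interface MessageCollector {
        void send(OutgoingMessageEnvelope envelope);
    }

    // Hypothetical collector that records envelopes in memory.
    static final class ListCollector implements MessageCollector {
        final List<OutgoingMessageEnvelope> sent = new ArrayList<>();
        @Override
        public void send(OutgoingMessageEnvelope envelope) { sent.add(envelope); }
    }

    // Output stream declared once as a constant, as the text describes.
    private static final SystemStream OUTPUT = new SystemStream("kafka", "output");

    // Wraps the payload (and optional key) in an envelope and hands it to the collector.
    static void emit(MessageCollector collector, Object key, Object message) {
        collector.send(new OutgoingMessageEnvelope(OUTPUT, key, message));
    }

    public static void main(String[] args) {
        ListCollector collector = new ListCollector();
        emit(collector, "user-1", "hello");
        OutgoingMessageEnvelope first = collector.sent.get(0);
        System.out.println(first.getSystemStream().getSystem() + "/"
                + first.getSystemStream().getStream() + " <- " + first.getMessage());
    }
}
```

Keeping the SystemStream as a constant means the destination is named in exactly one place, and each call site only supplies the key and payload.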
Implementing a Stream Task
The task itself is defined by the StreamTask interface, which is implemented
by all task classes. This interface has a single process method, which takes
an IncomingMessageEnvelope, a MessageCollector, and a
TaskCoordinator as arguments. The IncomingMessageEnvelope is
a message from the task.inputs streams as described in the previous
section, whereas the MessageCollector is used to emit events to output
streams, which are defined within the class.
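To make the shape of the interface concrete, here is a small self-contained sketch. It assumes simplified stand-in types with the same names as the Samza classes (they are not the real library, and the TaskCoordinator stand-in is an empty placeholder), and it uses a hypothetical UppercaseTask and an "uppercased" topic invented purely for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Stand-ins that mirror the names used in the text; not the real Samza classes.
final class StreamTaskSketch {

    static final class SystemStream {
        private final String system;
        private final String stream;
        SystemStream(String system, String stream) {
            this.system = system;
            this.stream = stream;
        }
        String getSystem() { return system; }
        String getStream() { return stream; }
    }

    static final class IncomingMessageEnvelope {
        private final Object key;
        private final Object message;
        IncomingMessageEnvelope(Object key, Object message) {
            this.key = key;
            this.message = message;
        }
        Object getKey() { return key; }
        Object getMessage() { return message; }
    }

    static final class OutgoingMessageEnvelope {
        private final SystemStream systemStream;
        private final Object message;
        OutgoingMessageEnvelope(SystemStream systemStream, Object message) {
            this.systemStream = systemStream;
            this.message = message;
        }
        SystemStream getSystemStream() { return systemStream; }
        Object getMessage() { return message; }
    }

    interface MessageCollector {
        void send(OutgoingMessageEnvelope envelope);
    }

    // Empty placeholder; this sketch never uses the coordinator.
    interface TaskCoordinator { }

    // Single-method interface, as described in the text.
    interface StreamTask {
        void process(IncomingMessageEnvelope envelope,
                     MessageCollector collector,
                     TaskCoordinator coordinator) throws Exception;
    }

    // Hypothetical task: upper-cases each incoming string and emits it.
    static final class UppercaseTask implements StreamTask {
        private static final SystemStream OUTPUT =
                new SystemStream("kafka", "uppercased");

        // An implementation may declare fewer checked exceptions than the interface.
        @Override
        public void process(IncomingMessageEnvelope envelope,
                            MessageCollector collector,
                            TaskCoordinator coordinator) {
            String text = (String) envelope.getMessage();
            collector.send(new OutgoingMessageEnvelope(OUTPUT, text.toUpperCase()));
        }
    }

    public static void main(String[] args) {
        List<OutgoingMessageEnvelope> sent = new ArrayList<>();
        MessageCollector collector = sent::add;  // in-memory collector for the demo
        new UppercaseTask().process(
                new IncomingMessageEnvelope(null, "hello samza"), collector, null);
        System.out.println(sent.get(0).getMessage());
    }
}
```

The task never opens a connection or names a producer itself. It only reads from the envelope it is handed and writes to the collector, which is what lets the framework decide how and when messages are actually delivered.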
For example, the following task splits the input from the
wikipedia-raw topic into words, the same way that words were split up
in the Trident example earlier in this chapter. In this case the JSON SerDe
parses the incoming data (which is in JSON), and the resulting words are
sent to the wikipedia-words topic.
public class WordSplitTask implements StreamTask {
    private static final SystemStream OUTPUT_STREAM
        = new SystemStream("kafka", "wikipedia-words");

    public void process(IncomingMessageEnvelope envelope,
                        MessageCollector collector,
                        TaskCoordinator coordinator) throws Exception {