builder.setBolt("processing", new BasicBolt())
    .fieldsGrouping("input", new Fields("key1", "key2"));
All and Global Groupings
The allGrouping and globalGrouping methods are exact opposites of each other. The allGrouping method ensures that all tuples in an event stream are transmitted to all running tasks for a particular bolt. This is occasionally useful, but it multiplies the number of events by the number of running tasks, so use it with care.
The globalGrouping method does the opposite. It ensures that all tuples from a stream go to the bolt task with the lowest-numbered identifier. Every bolt task receives an identifier number, but this grouping ensures that only one of them is used. It should also be used with care because it effectively disables Storm's parallelism for that bolt.
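The difference between the two groupings can be sketched in plain Java. This is only a model of the routing rule described above, not Storm code: the class name, the method names, and the task identifiers are all hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Plain-Java model of the two routing rules; nothing here is Storm API.
class AllGlobalSketch {

    // allGrouping: every task of the subscribing bolt receives the tuple.
    static List<Integer> allTargets(List<Integer> taskIds) {
        return new ArrayList<>(taskIds);
    }

    // globalGrouping: only the task with the lowest identifier receives it.
    static List<Integer> globalTargets(List<Integer> taskIds) {
        return Collections.singletonList(Collections.min(taskIds));
    }

    public static void main(String[] args) {
        List<Integer> taskIds = Arrays.asList(7, 4, 9); // hypothetical task ids
        int tuples = 1000;

        // Under allGrouping the delivery count grows with the number of tasks.
        int deliveries = tuples * allTargets(taskIds).size();
        System.out.println("all: " + deliveries + " deliveries");

        // Under globalGrouping every tuple lands on the same single task.
        System.out.println("global: task " + globalTargets(taskIds).get(0));
    }
}
```

In a topology definition, either grouping would be attached in the same way as the fieldsGrouping call shown earlier, for example by calling allGrouping("input") or globalGrouping("input") on the result of setBolt.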
Direct Groupings
The directGrouping method is a special form of grouping that allows the producing Bolt or Spout to decide which Bolt task receives a particular Tuple. This requires that the stream be declared as direct when the bolt is created. The producer must also use the emitDirect method instead of the emit method.
The emitDirect method takes an extra parameter that identifies the destination Bolt task. The list of valid identifiers can be obtained from the TopologyContext when implementing a Bolt. The section on implementing Bolts covers this in greater detail.
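The producer-side decision can be sketched as follows. This is a plain-Java model under stated assumptions, not Storm code: the class DirectRoutingSketch, its least-loaded selection policy, and the task identifiers are hypothetical, and the emitDirect method here is only a stand-in that records where a tuple would go.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Plain-Java model of producer-side routing; nothing here is Storm API.
class DirectRoutingSketch {
    // In Storm these identifiers would come from the TopologyContext.
    private final List<Integer> validTaskIds;
    // Number of tuples sent to each task so far.
    private final Map<Integer, Integer> sent = new HashMap<>();

    DirectRoutingSketch(List<Integer> validTaskIds) {
        this.validTaskIds = validTaskIds;
        for (int id : validTaskIds) {
            sent.put(id, 0);
        }
    }

    // Picks the task that has received the fewest tuples (ties go to the earliest in the list).
    int chooseTask() {
        int best = validTaskIds.get(0);
        for (int id : validTaskIds) {
            if (sent.get(id) < sent.get(best)) {
                best = id;
            }
        }
        return best;
    }

    // Stand-in for emitDirect: the extra first argument names the destination task.
    void emitDirect(int taskId, String value) {
        if (!sent.containsKey(taskId)) {
            throw new IllegalArgumentException("unknown task " + taskId);
        }
        sent.merge(taskId, 1, Integer::sum);
    }

    int sentTo(int taskId) {
        return sent.get(taskId);
    }

    public static void main(String[] args) {
        DirectRoutingSketch producer = new DirectRoutingSketch(Arrays.asList(3, 5, 8));
        for (String value : Arrays.asList("a", "b", "c", "d")) {
            producer.emitDirect(producer.chooseTask(), value);
        }
        for (int id : Arrays.asList(3, 5, 8)) {
            System.out.println("task " + id + " received " + producer.sentTo(id));
        }
    }
}
```

The point of the model is that the routing policy lives entirely in the producer, which is something none of the other groupings allow.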
Custom Groupings
When all else fails, it is also possible to implement a custom grouping method. To do this, create a class that implements the CustomStreamGrouping interface.
This interface contains two methods. The first method, prepare, is called when the topology is instantiated and lets the grouping method assemble any metadata it may need to perform its tasks. In particular, this method receives the targetTasks variable, which lists the tasks of the Bolts that subscribe to the stream being grouped.
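The shape of such a grouping can be sketched in plain Java. This is a model, not Storm's own interface: the nested Grouping interface is a hypothetical stand-in with simplified parameters, and the sketch assumes that the second method is the one that selects destination tasks for each tuple (called chooseTasks here).

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Plain-Java model of a custom grouping; nothing here is Storm API.
class CustomGroupingSketch {

    // Hypothetical stand-in for the two-method shape, with simplified parameters.
    interface Grouping {
        void prepare(List<Integer> targetTasks);
        List<Integer> chooseTasks(List<Object> values);
    }

    // Routes each tuple to one task based on the hash of its first value.
    static class FirstValueGrouping implements Grouping {
        private List<Integer> targetTasks;

        @Override
        public void prepare(List<Integer> targetTasks) {
            // Keep the subscriber tasks so that chooseTasks can pick among them.
            this.targetTasks = targetTasks;
        }

        @Override
        public List<Integer> chooseTasks(List<Object> values) {
            // floorMod keeps the index non-negative even for negative hash codes.
            int index = Math.floorMod(values.get(0).hashCode(), targetTasks.size());
            return Collections.singletonList(targetTasks.get(index));
        }
    }

    public static void main(String[] args) {
        Grouping grouping = new FirstValueGrouping();
        grouping.prepare(Arrays.asList(11, 12, 13)); // hypothetical task ids
        System.out.println(grouping.chooseTasks(Arrays.<Object>asList("user-42", 1)));
        System.out.println(grouping.chooseTasks(Arrays.<Object>asList("user-42", 2)));
    }
}
```

Both calls in main share the same first value, so they select the same task, which is the behavior a key-based custom grouping would typically want.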