Database Reference
In-Depth Information
batch to a single partition. Different batches may be sent to a different
partition.
• There is also a partition function that takes an implementation of
CustomStreamGrouping to implement custom partition methods.
In addition to these partition methods, Trident also implements a special
type of partition operation called groupBy . This partition operation acts
very much like the reducer phase in a Map-Reduce job and is probably the
most commonly used partition in Trident.
The groupBy operation first applies a partitionBy partition according to
the fields specified in the groupBy operation. Within each partition, values
with identical hash values are then grouped together for further processing
in a GroupedStream . Aggregators can then be applied directly to these
groups.
Aggregation
Trident has two methods of aggregation on streams, aggregate and
persistent-Aggregate . The aggregate method applied to a stream
with a function that implements ReducerAggregator or Aggregator
effectively performs a global partition on the data, causing all tuples to be
sent tothesame partition foraggregation onaper-batch basis. An aggregate
method that is given a CombinerAggregator first computes an
intermediate aggregate for each partition and then performs a global
partition on the output of these intermediate aggregates.
Thebuilt-inaggregators Count and Sum areboth CombinerAggregators .
Other custom aggregation methods can be implemented by extending the
BaseAggregator class. An example of this is shown in the next section
when a partition local aggregator is implemented.
The other aggregation method, persistentAggregate , works like an
aggregate except that it records its output into a State object. These State
objects often work with the Spout in a Trident topology to ensure features
like transactional semantics.
The persistentAggregate method returns a TridentState object
rather than a Stream object. This object has a method newValuesStream
that emits a tuple every time a key's state changes. This is useful for
Search WWH ::




Custom Search