}
});
assertEquals("{(5,apple),(6,banana;cherry)}", dump(d));
NOTE
String concatenation is not commutative, so the order in which the values are combined
affects the output, and the result is not deterministic. This may or may not be important
in your application!
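To see why ordering matters, here is a minimal plain-Java sketch (it does not use Crunch; the class and method names are hypothetical and exist only for illustration). It joins the same two values in two different arrival orders, which is what can happen when a combiner receives values in an unspecified order:

```java
import java.util.Arrays;
import java.util.List;

public class ConcatOrderDemo {
    // Joins the values in the order they arrive, as a concatenating combiner would.
    static String concat(List<String> values, String separator) {
        return String.join(separator, values);
    }

    public static void main(String[] args) {
        // The same two values for key 6, arriving in two different orders.
        System.out.println(concat(Arrays.asList("banana", "cherry"), ";")); // prints banana;cherry
        System.out.println(concat(Arrays.asList("cherry", "banana"), ";")); // prints cherry;banana
    }
}
```

If the application needs a stable result, one option is to sort the values before joining them, at the cost of extra work.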
The code is somewhat cluttered by the use of Pair objects in the process() method
signature; they have to be unwrapped with calls to first() and second(), and a new
Pair object is created to emit the new key-value pair. This combining function does not
alter the key, so we can use an overloaded form of combineValues() that takes an
Aggregator object, which operates only on the values and passes the keys through
unchanged. Even better, we can use a built-in Aggregator implementation for performing
string concatenation, found in the Aggregators class. The code becomes:
PTable<Integer, String> e =
    c.combineValues(Aggregators.STRING_CONCAT(";", false));
assertEquals("{(5,apple),(6,banana;cherry)}", dump(e));
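The idea of combining only the values while leaving the keys alone can be sketched in plain Java. This is not the Crunch API: the class ValueOnlyCombine and its combineValues() helper are hypothetical stand-ins, and an in-memory map of lists plays the role of the grouped table.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.BinaryOperator;

public class ValueOnlyCombine {
    // Reduces each key's values to a single value; the key is passed through untouched.
    static Map<Integer, String> combineValues(Map<Integer, List<String>> grouped,
                                              BinaryOperator<String> combiner) {
        Map<Integer, String> result = new TreeMap<>();
        for (Map.Entry<Integer, List<String>> entry : grouped.entrySet()) {
            // reduce() yields an empty Optional for an empty group, which is skipped.
            entry.getValue().stream()
                .reduce(combiner)
                .ifPresent(v -> result.put(entry.getKey(), v));
        }
        return result;
    }

    public static void main(String[] args) {
        Map<Integer, List<String>> grouped = new TreeMap<>();
        grouped.put(5, Arrays.asList("apple"));
        grouped.put(6, Arrays.asList("banana", "cherry"));
        // The combiner sees only values, never keys.
        System.out.println(combineValues(grouped, (a, b) -> a + ";" + b));
        // prints {5=apple, 6=banana;cherry}
    }
}
```

Because the combining function never sees the key, it cannot change it by accident, which is the same property the Aggregator form of combineValues() relies on.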
Sometimes you may want to aggregate the values in a PGroupedTable and return a
result with a different type from the values being grouped. This can be achieved using the
mapValues() method with a MapFn for converting the iterable collection into another
object. For example, the following calculates the number of values for each key:
PTable<Integer, Integer> f = c.mapValues(new MapFn<Iterable<String>, Integer>() {
  @Override
  public Integer map(Iterable<String> input) {
    return Iterables.size(input);
  }
}, ints());
assertEquals("{(5,1),(6,2)}", dump(f));
Notice that the values are strings, but the result of applying the map function is an integer,
the size of the iterable collection computed using Guava's Iterables class.
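The change of type from the grouped values to the result can also be shown without Crunch or Guava. In the following plain-Java sketch, MapValuesSketch, mapValues(), and size() are hypothetical names chosen for illustration; the function receives the whole iterable of values for a key and may return any type.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Function;

public class MapValuesSketch {
    // Applies fn to the full collection of values for each key; R may differ from V.
    static <K, V, R> Map<K, R> mapValues(Map<K, List<V>> grouped,
                                         Function<Iterable<V>, R> fn) {
        Map<K, R> result = new TreeMap<>();
        for (Map.Entry<K, List<V>> entry : grouped.entrySet()) {
            result.put(entry.getKey(), fn.apply(entry.getValue()));
        }
        return result;
    }

    // Counts elements by iterating, since an Iterable has no size() method of its own.
    static int size(Iterable<?> values) {
        int count = 0;
        for (Object ignored : values) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        Map<Integer, List<String>> grouped = new TreeMap<>();
        grouped.put(5, Arrays.asList("apple"));
        grouped.put(6, Arrays.asList("banana", "cherry"));
        // String values in, Integer counts out.
        Map<Integer, Integer> counts = mapValues(grouped, MapValuesSketch::size);
        System.out.println(counts); // prints {5=1, 6=2}
    }
}
```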
You might wonder why the combineValues() operation exists at all, given that the
mapValues() method is more powerful. The reason is that combineValues() can
be run as a MapReduce combiner, and therefore it can improve performance by being run
on the map side, which has the effect of reducing the amount of data that has to be trans-