In the following lines of code, we can see that the approach to computing the most popular
product is the same as in the Scala example. The extra code might seem complex, but most of
it is the Java boilerplate required to create the anonymous functions (the PairFunction,
Function2, and Comparator instances). The actual functionality is the same:
// let's find our most popular product
// first we map the data to records of (product, 1) using a PairFunction
// and the Tuple2 class.
// then we call a reduceByKey operation with a Function2, which is
// essentially the sum function
// note: from Spark 1.0 onward, the Java API expects mapToPair (not map)
// when passing a PairFunction
List<Tuple2<String, Integer>> pairs = data.map(
    new PairFunction<String[], String, Integer>() {
        @Override
        public Tuple2<String, Integer> call(String[] strings) throws Exception {
            // strings[1] is the product field of the record
            return new Tuple2<String, Integer>(strings[1], 1);
        }
    }).reduceByKey(new Function2<Integer, Integer, Integer>() {
        @Override
        public Integer call(Integer integer, Integer integer2) throws Exception {
            return integer + integer2;
        }
    }).collect();
// finally we sort the result. Note we need to create a Comparator
// that reverses the sort order.
Collections.sort(pairs, new Comparator<Tuple2<String, Integer>>() {
    @Override
    public int compare(Tuple2<String, Integer> o1, Tuple2<String, Integer> o2) {
        return -(o1._2() - o2._2());
    }
});
String mostPopular = pairs.get(0)._1();
int purchases = pairs.get(0)._2();
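To see the map, reduce-by-key, and sort steps in isolation, the same logic can be tried without Spark using plain Java collections. The following is only a minimal sketch under stated assumptions: the class name MostPopularProduct, the helper countByProduct, and the sample records are hypothetical and not part of the original example, and each record is assumed to have the product name at index 1, as in the listing above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical, Spark-free illustration of the same three steps.
class MostPopularProduct {

    // Returns (product, purchase count) pairs sorted by count, highest first.
    // Assumption: the product name is the second field (index 1) of each record.
    static List<Map.Entry<String, Integer>> countByProduct(List<String[]> records) {
        // "map" and "reduceByKey" combined: treat each record as (product, 1)
        // and sum the ones per product.
        Map<String, Integer> counts = new HashMap<String, Integer>();
        for (String[] fields : records) {
            counts.merge(fields[1], 1, Integer::sum);
        }

        // "collect": copy the pairs into a list so that we can sort them.
        List<Map.Entry<String, Integer>> pairs =
            new ArrayList<Map.Entry<String, Integer>>(counts.entrySet());

        // Sort with a Comparator that reverses the natural order of the counts.
        Collections.sort(pairs, new Comparator<Map.Entry<String, Integer>>() {
            @Override
            public int compare(Map.Entry<String, Integer> o1,
                               Map.Entry<String, Integer> o2) {
                return o2.getValue().compareTo(o1.getValue());
            }
        });
        return pairs;
    }

    public static void main(String[] args) {
        // Made-up sample records of the form (user, product, price).
        List<String[]> records = Arrays.asList(
            new String[] {"John", "iPhone Cover", "9.99"},
            new String[] {"John", "Headphones", "5.49"},
            new String[] {"Jack", "iPhone Cover", "9.99"},
            new String[] {"Jill", "Samsung Galaxy Cover", "8.95"},
            new String[] {"Bob", "iPad Cover", "5.49"});

        List<Map.Entry<String, Integer>> pairs = countByProduct(records);
        String mostPopular = pairs.get(0).getKey();
        int purchases = pairs.get(0).getValue();
        // prints: Most popular product: iPhone Cover with 2 purchases
        System.out.println("Most popular product: " + mostPopular
            + " with " + purchases + " purchases");
    }
}
```

The comparator here uses compareTo rather than negating a subtraction; both give the same ordering for purchase counts, and compareTo also avoids integer overflow for extreme values.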