Example 3-31. flatMap() in Java, splitting lines into multiple words
JavaRDD<String> lines = sc.parallelize(Arrays.asList("hello world", "hi"));
JavaRDD<String> words = lines.flatMap(new FlatMapFunction<String, String>() {
  public Iterable<String> call(String line) {
    return Arrays.asList(line.split(" "));
  }
});
words.first(); // returns "hello"
We illustrate the difference between flatMap() and map() in Figure 3-3. You can
think of flatMap() as “flattening” the iterators returned to it, so that instead of
ending up with an RDD of lists we have an RDD of the elements in those lists.
Figure 3-3. Difference between flatMap() and map() on an RDD
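The same contrast can be tried without a cluster. The following is a minimal sketch that uses Java's standard Stream API as a stand-in; it is not Spark code, and the class and method names (FlatMapVsMap, mapLines, flatMapLines) are hypothetical names chosen for illustration. It assumes only that map() yields one result per input while flatMap() merges the per-input results into a single sequence, as described above.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Illustration only: java.util.stream is used as an analogy for RDD map()/flatMap().
public class FlatMapVsMap {
    // map(): one output element per input line, so each element is itself a list of words
    public static List<List<String>> mapLines(List<String> lines) {
        return lines.stream()
                .map(line -> Arrays.asList(line.split(" ")))
                .collect(Collectors.toList());
    }

    // flatMap(): the per-line word sequences are flattened into one sequence of words
    public static List<String> flatMapLines(List<String> lines) {
        return lines.stream()
                .flatMap(line -> Arrays.stream(line.split(" ")))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("hello world", "hi");
        System.out.println(mapLines(lines));     // prints [[hello, world], [hi]]
        System.out.println(flatMapLines(lines)); // prints [hello, world, hi]
    }
}
```

With map() the result has two elements, one list per line; with flatMap() it has three, one per word.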
Pseudo set operations
RDDs support many of the operations of mathematical sets, such as union and
intersection, even when the RDDs themselves are not properly sets. Four operations
are shown in Figure 3-4. All of these operations require that the RDDs being
operated on are of the same type.
Figure 3-4. Some simple set operations
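To see what "not properly sets" means in practice, here is a small plain-Java sketch on lists. It is not Spark code: the class PseudoSetOps and its methods are hypothetical, written to mimic the behavior of RDD.union(), which keeps duplicates, and RDD.intersection(), which removes them. The shared type parameter T stands in for the requirement that both inputs have the same element type. Spark does not guarantee element order, so the ordering here is just an artifact of the list-based sketch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;

// Illustration only: list-based stand-ins for two pseudo set operations.
public class PseudoSetOps {
    // Mimics RDD.union(): concatenates both inputs and keeps duplicates
    public static <T> List<T> union(List<T> a, List<T> b) {
        List<T> out = new ArrayList<>(a);
        out.addAll(b);
        return out;
    }

    // Mimics RDD.intersection(): elements present in both inputs, duplicates removed
    public static <T> List<T> intersection(List<T> a, List<T> b) {
        LinkedHashSet<T> out = new LinkedHashSet<>(a); // drops duplicates from a
        out.retainAll(b);                              // keeps only elements also in b
        return new ArrayList<>(out);
    }

    public static void main(String[] args) {
        List<String> a = Arrays.asList("coffee", "coffee", "panda");
        List<String> b = Arrays.asList("coffee", "monkey");
        System.out.println(union(a, b));        // prints [coffee, coffee, panda, coffee, monkey]
        System.out.println(intersection(a, b)); // prints [coffee]
    }
}
```

The union result still contains "coffee" three times, which is why these are pseudo set operations rather than true ones.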
 