Given the way problems are decomposed, the nature of the initial source is extremely important
in influencing the performance of this decomposition. Intuitively, the ease with which we
can repeatedly split a data structure in half corresponds to how fast it can be operated upon.
Splitting in half also means that the values to be operated upon need to split equally.
We can split up common data sources from the core library into three main groups by
performance characteristics:
The good
An ArrayList, an array, or the IntStream.range factory method. These data sources all
support random access, which means they can be split up arbitrarily with ease.
The okay
The HashSet and TreeSet. You can't easily decompose these with perfect amounts of
balance, but most of the time it's possible to do so.
The bad
Some data structures just don't split well; for example, they may take O(N) time to
decompose. Examples here include a LinkedList, which is computationally hard to split in
half. Also, Stream.iterate and BufferedReader.lines have unknown length at the
beginning, so it's pretty hard to estimate when to split these sources.
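One way to see these groups for yourself is to ask each source for its Spliterator, the object the streams framework uses to split work, and look at what a single split produces. The following is only a minimal sketch: the class name SplitDemo and the helpers splitSizes, isSized, and report are made up for illustration, and the exact split sizes of a LinkedList or Stream.iterate are implementation details that may vary between JDK versions.

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;
import java.util.Spliterator;
import java.util.stream.IntStream;
import java.util.stream.Stream;

public class SplitDemo {

    // Hypothetical helper: splits the source once and returns
    // {estimated size of the split-off prefix, estimated size left behind}.
    static long[] splitSizes(Spliterator<?> source) {
        Spliterator<?> prefix = source.trySplit();
        long prefixSize = (prefix == null) ? 0 : prefix.estimateSize();
        return new long[] { prefixSize, source.estimateSize() };
    }

    // Hypothetical helper: true when the source reports an exact size up front.
    static boolean isSized(Spliterator<?> source) {
        return source.hasCharacteristics(Spliterator.SIZED);
    }

    // Prints whether the source is sized and how its first split divides the data.
    static void report(String name, Spliterator<?> source) {
        boolean sized = isSized(source);
        long[] sizes = splitSizes(source);
        System.out.println(name + ": sized=" + sized
                + ", first split=" + sizes[0] + " / " + sizes[1]);
    }

    public static void main(String[] args) {
        List<Integer> arrayList = new ArrayList<>();
        for (int i = 0; i < 10_000; i++) {
            arrayList.add(i);
        }
        List<Integer> linkedList = new LinkedList<>(arrayList);

        report("ArrayList", arrayList.spliterator());
        report("IntStream.range", IntStream.range(0, 10_000).spliterator());
        report("LinkedList", linkedList.spliterator());
        // An unbounded source cannot report an exact size.
        report("Stream.iterate", Stream.iterate(0, n -> n + 1).spliterator());
    }
}
```

An ArrayList should report two equal halves, while the sources in the last group either split unevenly or cannot say how large they are to begin with.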
The influence of the initial data structure can be huge. To take an extreme example,
benchmarking a parallel sum over 10,000 integers revealed an ArrayList to be 10 times faster
than a LinkedList. This isn't to say that your business logic will exhibit the same
performance characteristics, but it does demonstrate how influential these things can be. It's also far
more likely that data structures such as a LinkedList that have poor decompositions will
also be slower when run in parallel.
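For reference, the operation being compared might look something like the sketch below. This is not the benchmark itself: the class name ParallelSumDemo and the method parallelSum are invented for illustration, and the code makes no attempt to measure time. A trustworthy measurement would need a proper harness such as JMH, with warm-up and repeated runs.

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;

public class ParallelSumDemo {

    // Hypothetical helper: the same parallel sum works on any List;
    // only the cost of splitting the underlying source differs.
    static int parallelSum(List<Integer> values) {
        return values.parallelStream()
                     .mapToInt(Integer::intValue)
                     .sum();
    }

    public static void main(String[] args) {
        List<Integer> arrayList = new ArrayList<>();
        for (int i = 0; i < 10_000; i++) {
            arrayList.add(i);
        }
        List<Integer> linkedList = new LinkedList<>(arrayList);

        // Both calls compute the same result; they differ only in how
        // easily the framework can decompose the source.
        System.out.println("ArrayList sum:  " + parallelSum(arrayList));
        System.out.println("LinkedList sum: " + parallelSum(linkedList));
    }
}
```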
Ideally, once the streams framework has decomposed the problem into smaller chunks, we'll
be able to operate on each chunk in its own thread, with no further communication or
contention between threads. Unfortunately, reality can get in the way of the ideal at times!
When we're talking about the kinds of operations in our stream pipeline that let us operate on
chunks individually, we can differentiate between two types of stream operations: stateless
and stateful. Stateless operations need to maintain no concept of state over the whole
operation; stateful operations have the overhead and constraint of maintaining state.
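To make the distinction concrete, here is a small sketch of one pipeline of each kind. The class name StatefulDemo and both method names are hypothetical. The first pipeline uses only filter and map, which look at one element at a time; the second uses distinct and sorted, which the Stream API documentation describes as stateful because they may need to see other elements before producing a result.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class StatefulDemo {

    // Stateless: each element is filtered and mapped on its own,
    // so chunks can be processed without knowing about each other.
    static List<Integer> statelessPipeline(List<Integer> values) {
        return values.parallelStream()
                     .filter(v -> v % 2 == 0)
                     .map(v -> v * v)
                     .collect(Collectors.toList());
    }

    // Stateful: distinct must remember what it has already seen, and
    // sorted cannot emit anything until it has seen every element.
    static List<Integer> statefulPipeline(List<Integer> values) {
        return values.parallelStream()
                     .distinct()
                     .sorted()
                     .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(statelessPipeline(Arrays.asList(1, 2, 3, 4)));
        System.out.println(statefulPipeline(Arrays.asList(3, 1, 3, 2, 1)));
    }
}
```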