Invoking the join method on a task blocks the caller until the result produced by that task is ready.
For this reason, it's necessary to call it after the computation of both subtasks has been started.
Otherwise, you'll end up with a slower and more complex version of your original sequential algorithm
because every subtask will have to wait for the other one to complete before starting.
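To make this concrete, the following is a minimal sketch of a recursive summing task in the spirit of the chapter's ForkJoinSumCalculator; the SumTask name and its details are illustrative, not the book's listing. Note that both subtasks have been started before join is called on either of them:

import java.util.concurrent.RecursiveTask;

/**
 * Hypothetical recursive sum task, shown only to illustrate the
 * fork/compute/join ordering discussed above.
 */
public class SumTask extends RecursiveTask<Long> {

    private static final long THRESHOLD = 10_000;
    private final long[] numbers;
    private final int start;
    private final int end;

    public SumTask(long[] numbers, int start, int end) {
        this.numbers = numbers;
        this.start = start;
        this.end = end;
    }

    @Override
    protected Long compute() {
        int length = end - start;
        if (length <= THRESHOLD) {
            long sum = 0;
            for (int i = start; i < end; i++) {
                sum += numbers[i];
            }
            return sum;
        }
        SumTask left  = new SumTask(numbers, start, start + length / 2);
        SumTask right = new SumTask(numbers, start + length / 2, end);

        // Anti-pattern: left.fork().join() here would block before `right`
        // even starts, serializing the two halves.

        left.fork();                        // schedule the left half asynchronously
        Long rightResult = right.compute(); // compute the right half in this thread
        Long leftResult  = left.join();     // wait for the left half only now
        return leftResult + rightResult;
    }
}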
The invoke method of a ForkJoinPool shouldn't be used from within a RecursiveTask . Instead,
you should always call the methods compute or fork directly; only sequential code should use
invoke to begin parallel computation.
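Assuming the hypothetical SumTask sketched above, a driver might look like this. The call to invoke appears exactly once, in sequential code, to kick off the parallel computation; inside the task itself only compute, fork, and join are used:

import java.util.concurrent.ForkJoinPool;
import java.util.stream.LongStream;

public class SumMain {
    public static void main(String[] args) {
        long[] numbers = LongStream.rangeClosed(1, 10_000_000).toArray();
        // invoke is called once, from sequential code, to start the parallel computation.
        long sum = ForkJoinPool.commonPool().invoke(new SumTask(numbers, 0, numbers.length));
        System.out.println("Sum: " + sum);
    }
}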
Calling the fork method on a subtask is the way to schedule it on the ForkJoinPool . It might seem
natural to invoke it on both the left and right subtasks, but this is less efficient than just directly
calling compute on one of them. Doing this allows you to reuse the same thread for one of the two
subtasks and avoid the overhead caused by the unnecessary allocation of a further task on the pool.
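The two styles can be contrasted with a pair of illustrative helper methods (again assuming the hypothetical SumTask from the earlier sketch; the method names are made up for the comparison):

// Two ways of combining subtasks inside a RecursiveTask's compute() method.
class ForkStyles {

    // Forks both halves: the current worker thread does nothing but wait on two joins.
    static long forkBoth(SumTask left, SumTask right) {
        left.fork();
        right.fork();
        return left.join() + right.join();
    }

    // Forks one half and computes the other directly: the current thread is
    // reused and one task allocation and scheduling step are avoided.
    static long forkOneComputeOther(SumTask left, SumTask right) {
        left.fork();
        long rightResult = right.compute();
        return left.join() + rightResult;
    }
}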
Debugging a parallel computation using the fork/join framework can be tricky. In particular, it's
ordinarily quite common to browse a stack trace in your favorite IDE to discover the cause of a
problem, but this can't work with a fork/join computation because the call to compute occurs in a
different thread than the conceptual caller, which is the code that called fork .
As you've discovered with parallel streams, you should never take for granted that a computation
using the fork/join framework on a multicore processor is faster than the sequential counterpart. We
already said that a task should be decomposable into several independent subtasks in order to be
parallelizable with a relevant performance gain. All of these subtasks should take longer to execute
than forking a new task; one idiom is to put I/O into one subtask and computation into another,
thereby overlapping computation with I/O. Moreover, you should consider other things when
comparing the performance of the sequential and parallel versions of the same algorithm. Like any
other Java code, the fork/join framework needs to be “warmed up,” or executed, a few times before
being optimized by the JIT compiler. This is why it's always important to run the program multiple
times before measuring its performance, as we did in our harness. Also be aware that optimizations
built into the compiler could unfairly give an advantage to the sequential version (for example, by
performing dead code analysis—removing a computation that's never used).
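A minimal measurement sketch along these lines is shown below; the Benchmark class and measure method are illustrative, not the book's harness. The computation is run several times so the JIT compiler can warm up, the fastest run is reported, and the result of the last run is printed so the work being timed can't be removed as dead code:

import java.util.concurrent.ForkJoinPool;
import java.util.function.Supplier;
import java.util.stream.LongStream;

public class Benchmark {

    public static <T> long measure(Supplier<T> computation, int runs) {
        long fastest = Long.MAX_VALUE;
        T result = null;
        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime();
            result = computation.get();
            long duration = (System.nanoTime() - start) / 1_000_000;
            fastest = Math.min(fastest, duration);
        }
        // Using the result guards against the computation being eliminated as dead code.
        System.out.println("Result of last run: " + result);
        return fastest; // best observed time in milliseconds
    }

    public static void main(String[] args) {
        long[] numbers = LongStream.rangeClosed(1, 10_000_000).toArray();
        long best = measure(
            () -> ForkJoinPool.commonPool().invoke(new SumTask(numbers, 0, numbers.length)),
            10);
        System.out.println("Fastest run: " + best + " ms");
    }
}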
The fork/join splitting strategy deserves one last note: you must choose the criteria used to
decide if a given subtask should be further split or is small enough to be evaluated sequentially.
We give some hints about this in the next section.
7.2.3. Work stealing
In our ForkJoinSumCalculator example we decided to stop creating more subtasks when the
array of numbers to be summed contained at most 10,000 items. This is an arbitrary choice, but
in most cases it's difficult to find a good heuristic, other than trying to optimize it by making
several attempts with different inputs. In our test case, we started with an array of 10 million
items, meaning that the ForkJoinSumCalculator will fork at least 1,000 subtasks. This might
 