but you want to utilize only four of them: the goal is to maximize the CPU usage of those four.
Clearly, then, the maximum number of threads must be set to at least four. Granted, there are
threads in the JVM doing things other than processing these tasks, but these threads will
almost never need an entire CPU. One exception is if a concurrent-mode garbage collector is
being used, as discussed in Chapter 5: the background threads there must have enough CPU
to operate, lest they fall behind in processing the heap.
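As a minimal sketch of the sizing rule above, the following creates a pool whose core and maximum sizes are both four. The class name FourCpuPool and the constant TARGET_CPUS are hypothetical names chosen for illustration; the value four simply mirrors the example in the text and is not a general recommendation.

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class FourCpuPool {
    // Assumption: the number of CPUs we want to keep busy, taken from the example in the text.
    static final int TARGET_CPUS = 4;

    // Builds a pool with core size == maximum size == TARGET_CPUS and an unbounded queue,
    // so at most TARGET_CPUS tasks run at once and the rest wait in the queue.
    static ThreadPoolExecutor newPool() {
        return new ThreadPoolExecutor(
                TARGET_CPUS, TARGET_CPUS,
                0L, TimeUnit.MILLISECONDS,
                new LinkedBlockingQueue<>());
    }

    public static void main(String[] args) throws InterruptedException {
        ThreadPoolExecutor pool = newPool();
        System.out.println("CPUs visible to the JVM: "
                + Runtime.getRuntime().availableProcessors());
        System.out.println("Maximum pool size: " + pool.getMaximumPoolSize());
        pool.shutdown();
        pool.awaitTermination(1, TimeUnit.SECONDS);
    }
}
```

If a concurrent collector is in use, its background threads compete for the same CPUs, so the right pool size on a real system may differ from this sketch.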
Does it help to have more than four threads? This is where the characteristics of the workload
come into play. Take the simple case where the tasks are all compute-bound: they don't
make external network calls (e.g., to a database), nor do they have significant contention on
an internal lock. The stock price history batch program is such an application (when using a
mock entity manager): the data on the entities can be calculated completely in parallel.
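To make the idea of a compute-bound workload concrete, here is a rough sketch that runs independent, purely arithmetic tasks on pools of different sizes and reports the elapsed time. It is not the stock price history program: simulateHistory is a hypothetical stand-in for one calculation, and the task count, iteration count, and list of pool sizes are arbitrary values picked for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ComputeBoundDemo {
    // Hypothetical stand-in for one calculation: pure arithmetic,
    // no network calls and no shared locks.
    static double simulateHistory(int seed, int iterations) {
        double value = seed + 1;
        for (int i = 1; i <= iterations; i++) {
            value = Math.sqrt(value * i + 1.0);
        }
        return value;
    }

    // Runs taskCount independent tasks on a fixed pool of the given size
    // and returns the elapsed wall-clock time in milliseconds.
    static long timeRun(int threads, int taskCount, int iterations) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Callable<Double>> tasks = new ArrayList<>();
            for (int t = 0; t < taskCount; t++) {
                final int seed = t;
                tasks.add(() -> simulateHistory(seed, iterations));
            }
            long start = System.nanoTime();
            for (Future<Double> f : pool.invokeAll(tasks)) {
                f.get(); // wait for the result and surface any task failure
            }
            return (System.nanoTime() - start) / 1_000_000;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        // Arbitrary pool sizes; results depend entirely on the machine running this.
        for (int threads : new int[] {1, 2, 4, 8, 16}) {
            long ms = timeRun(threads, 1_000, 200_000);
            System.out.println(threads + " thread(s): " + ms + " ms");
        }
    }
}
```

This is a single unwarmed run per pool size, so the numbers it prints are only indicative; a careful measurement would repeat each configuration and discard warm-up iterations.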
Table 9-1 shows the performance of calculating the history of 10,000 mock stock entities using
a thread pool set to use the given number of threads on a machine with four CPUs. With
only a single thread in the pool, 255.6 seconds are needed to calculate the data set; with four
threads, only 77 seconds are required. After that, a little more time is needed as additional
threads are added.
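The two timings quoted above are enough to work out how close the four-thread run comes to ideal scaling. The sketch below does that arithmetic; the class and method names are hypothetical, and only the 255.6-second and 77-second figures come from the text.

```java
import java.util.Locale;

public class ScalingMath {
    // Elapsed time expressed as a percentage of the single-thread baseline.
    static double percentOfBaseline(double baselineSeconds, double seconds) {
        return 100.0 * seconds / baselineSeconds;
    }

    // Percentage of baseline that perfectly linear scaling would give.
    static double idealPercent(int threads) {
        return 100.0 / threads;
    }

    public static void main(String[] args) {
        double baseline = 255.6;    // one thread, from the text
        double fourThreads = 77.0;  // four threads, from the text

        System.out.println(String.format(Locale.ROOT,
                "measured: %.1f%% of baseline", percentOfBaseline(baseline, fourThreads)));
        // prints "measured: 30.1% of baseline"
        System.out.println(String.format(Locale.ROOT,
                "ideal:    %.1f%% of baseline", idealPercent(4)));
        // prints "ideal:    25.0% of baseline"
        System.out.println(String.format(Locale.ROOT,
                "speedup:  %.2fx", baseline / fourThreads));
        // prints "speedup:  3.32x"
    }
}
```

In other words, four threads deliver roughly a 3.3x speedup rather than the ideal 4x.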
Table 9-1. Time required to calculate 10,000 mock price histories
Number of threads    Seconds required    Percent of baseline
If the tasks in the application were completely parallel, then the “Percent of baseline” column
would show 50% for two threads and 25% for four threads. Such completely linear scaling is
impossible to come by for a number of reasons: if nothing else, the threads must coordinate
among themselves to pick a task from the run queue (and in general, there is usually more
synchronization among the threads). By the time four threads are used, the system is con-