such cycles during the 5-minute execution of a program, then the pauses have added a 3.4%
performance penalty: without the pauses, the program would have completed in 290 rather
than 300 seconds.
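The arithmetic behind that figure can be checked with a short sketch. The class and method names here are hypothetical, and the 10 seconds of total pause time is simply the difference between the 300-second and 290-second run times quoted above; the penalty is measured against the pause-free run time.

```java
import java.util.Locale;

public class PauseOverhead {
    /**
     * Returns the pause penalty as a percentage of the time the program
     * would have taken with no pauses at all.
     */
    static double penaltyPercent(double totalSeconds, double pausedSeconds) {
        double pauseFreeSeconds = totalSeconds - pausedSeconds;
        return 100.0 * pausedSeconds / pauseFreeSeconds;
    }

    public static void main(String[] args) {
        // 300 s observed run, of which 10 s were spent paused (300 - 290).
        double penalty = penaltyPercent(300.0, 10.0);
        System.out.println(String.format(Locale.ROOT, "%.1f%%", penalty)); // prints 3.4%
    }
}
```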
If extra CPU is available (and that might be a big if), then using a concurrent collector will
give the application a nice performance boost. The key here is whether adequate CPU is
available for the background processing of the concurrent GC threads. Take the simple case
of a single-CPU machine where there is a single application thread that consumes 100% of
the CPU. When that application is run with the throughput collector, GC will periodically
run, causing the application thread to pause. When the same application is run with a
concurrent collector, the operating system will sometimes run the application thread on the
CPU, and sometimes run the background GC thread. The net effect is the same: the application
thread is effectively paused (albeit for much shorter times) while the OS is running other
threads.
The same principle applies in the general case when there are multiple application threads,
multiple background GC threads, and multiple CPUs. If the operating system can't run all
the application threads at the same time as the background GC threads, then the competition
for the CPU has effectively introduced pauses into the behavior of the application threads.
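As a rough illustration of that condition, the sketch below compares the number of CPUs the JVM reports with the number of threads that want to run at once. The class name, the method name, and the thread counts are assumptions made for the example; the real number of background GC threads depends on the collector and on JVM settings, and this check ignores everything else running on the machine.

```java
public class CpuHeadroom {
    /**
     * True when every application thread and every background GC thread
     * could be scheduled on its own CPU at the same time.
     */
    static boolean hasHeadroom(int cpus, int appThreads, int gcThreads) {
        return appThreads + gcThreads <= cpus;
    }

    public static void main(String[] args) {
        int cpus = Runtime.getRuntime().availableProcessors();
        int appThreads = 1; // assumed: one busy application thread
        int gcThreads = 1;  // assumed: one background GC thread

        if (hasHeadroom(cpus, appThreads, gcThreads)) {
            System.out.println("Background GC threads can run without displacing application threads.");
        } else {
            System.out.println("Application and GC threads will compete for the CPU.");
        }
    }
}
```

With the single-CPU case described above, `hasHeadroom(1, 1, 1)` is false, which is the situation where a concurrent collector cannot deliver its benefit.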
Table 5-1 shows how this trade-off works. The batch application calculating stock data has
been run in a mode that saves each set of results in memory for a few minutes (to fill up the
heap); the test was run with both the CMS and throughput GC algorithms.
Table 5-1. Batch processing time with different GC algorithms
GC algorithm    4 CPUs (CPU utilization)    1 CPU (CPU utilization)
CMS             78.09 (30.7%)               120.0 (100%)
Throughput      81.00 (27.7%)               111.6 (100%)
The times in this table are the number of seconds required to run the test, and the CPU
utilization of the machine is shown in parentheses. When four CPUs are available, CMS runs a
batch of operations about 3 seconds faster than the throughput collector, but notice the
amount of CPU utilized in each case. There is a single application thread that will
continually run, so with four CPUs that application thread will consume 25% of the available CPU.
The extra CPU reported in the table comes from the extra processing introduced by the GC
threads. In the case of CMS, the background thread periodically consumes an entire CPU, or