The automation must gather every conceivable piece of data that will be useful for later
analysis. This includes system information sampled throughout the run: CPU usage, disk
usage, network usage, memory usage, and so on. It includes logs from the application—both those the application produces and the logs from the garbage collector. It includes low-impact profiling information, periodic thread stacks, and other heap analysis data such as histograms or full heap dumps (though full heap dumps, in particular, take a lot of space and cannot necessarily be kept long term).
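The periodic thread stacks mentioned above can be gathered from outside the JVM with tools like jstack or jcmd, or from inside it via the standard ThreadMXBean API. The following sketch shows the in-process approach; the class name StackSampler is invented for illustration, and a real harness would run this on a timer and write the snapshots to the test's data directory:

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

public class StackSampler {
    // Capture one snapshot of all live thread stacks. The booleans
    // suppress lock/synchronizer detail to keep the sample low-impact.
    public static String sample() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        StringBuilder sb = new StringBuilder();
        for (ThreadInfo ti : mx.dumpAllThreads(false, false)) {
            sb.append(ti.toString());
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(sample());
    }
}
```

Because dumpAllThreads pauses nothing and allocates little, a sample every few seconds adds negligible overhead to the run while still showing where threads spend their time.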
The monitoring information must also include data from other parts of the system, if applicable: for example, if the program uses a database, then include the system statistics from the database machine as well as any diagnostic output from the database (including performance reports like Oracle's Automatic Workload Repository [AWR] reports).
This data will guide the analysis of any regressions that are uncovered. If the CPU usage
has increased, it's time to consult the profile information to see what is taking more time.
If the time spent in GC has increased, it's time to consult the heap profiles to see what is
consuming more memory. If CPU time and GC time have decreased, contention somewhere has likely slowed down performance: stack data can point to particular synchronization bottlenecks (see Chapter 9), JFR recordings can be used to find application latencies, or database logs can point to something that has increased database contention.
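Checking whether GC time has increased between runs is straightforward with the platform's GarbageCollectorMXBean, which exposes the cumulative collection time per collector. This is a minimal sketch (the class name GcTimeCheck is invented); comparing the value at the end of a baseline run against the end of a candidate run flags a GC regression:

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

public class GcTimeCheck {
    // Sum the cumulative GC time (in milliseconds) across all
    // collectors in this JVM since startup.
    public static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc :
                ManagementFactory.getGarbageCollectorMXBeans()) {
            long t = gc.getCollectionTime(); // -1 if unsupported
            if (t > 0) {
                total += t;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println("GC time (ms): " + totalGcMillis());
    }
}
```

This gives only the headline number; once it points to a regression, the GC logs and heap histograms gathered earlier show what is actually consuming the memory.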
When figuring out the source of a regression, it is time to play detective, and the more
data that is available, the more clues there are to follow. As discussed in Chapter 1, it isn't necessarily the case that the JVM is the regression. Measure everything, everywhere, to make sure the correct analysis can be done.
Run on the target system
A test that is run on a single-core laptop will behave very differently from a test run on a machine with a 256-thread SPARC CPU. That should be clear in terms of threading effects: the larger machine will run more threads at the same time, reducing contention among application threads for access to the CPU. At the same time, the large system will show synchronization bottlenecks that would go unnoticed on the small laptop.
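One concrete source of that divergence is that the JVM sizes many of its defaults from the CPU count it sees. A quick way to check what a given target machine reports (this trivial sketch uses an invented class name, CpuCount):

```java
public class CpuCount {
    public static void main(String[] args) {
        // The JVM derives defaults such as GC thread counts and the
        // common fork/join pool size from the visible CPU count, so
        // the same test run behaves differently on a laptop than on
        // a large server.
        int cpus = Runtime.getRuntime().availableProcessors();
        System.out.println("Available CPUs: " + cpus);
    }
}
```

Note that in containers or with flags like -XX:ActiveProcessorCount, this value may differ from the physical core count, which is one more reason to run tests on the actual target system.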
There are other performance differences that are just as important even if they are not as
immediately obvious. Many important tuning flags calculate their defaults based on the
underlying hardware the JVM is running on. Code is compiled differently from platform