Java Reference
In-Depth Information
burning cycles with useless computation; we want to keep the CPUs busy with useful work.)
If the program is compute-bound, then we may be able to increase its capacity by adding
more processors; if it can't even keep the processors we have busy, adding more won't help.
Threading offers a means to keep the CPU(s) “hotter” by decomposing the application so
there is always work to be done by an available processor.
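As a hedged illustration of this idea (the class and method names here are hypothetical, not from the text), one common way to keep every processor supplied with work is to size a fixed thread pool to the number of available CPUs and split a computation into independent subtasks:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Sketch: decompose a computation so an available processor always has work.
// The pool is sized to the CPU count; the input range is split into chunks.
public class KeepCpusBusy {
    public static long sumOfSquares(int n) throws Exception {
        int cpus = Runtime.getRuntime().availableProcessors();
        ExecutorService pool = Executors.newFixedThreadPool(cpus);
        try {
            List<Future<Long>> results = new ArrayList<>();
            int chunk = Math.max(1, n / cpus);
            for (int start = 1; start <= n; start += chunk) {
                final int lo = start, hi = Math.min(n, start + chunk - 1);
                results.add(pool.submit(() -> {
                    long s = 0;
                    for (int i = lo; i <= hi; i++) s += (long) i * i;  // independent subtask
                    return s;
                }));
            }
            long total = 0;
            for (Future<Long> f : results) total += f.get();  // combine partial results
            return total;
        } finally {
            pool.shutdown();
        }
    }
}
```

With enough independent chunks, each processor stays busy until the whole range is consumed, which is exactly the decomposition the paragraph describes.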
11.1.1. Performance Versus Scalability
Application performance can be measured in a number of ways, such as service time, latency, throughput, efficiency, scalability, or capacity. Some of these (service time, latency) are measures of "how fast" a given unit of work can be processed or acknowledged; others (capacity, throughput) are measures of "how much" work can be performed with a given quantity of computing resources.
Scalability describes the ability to improve throughput or capacity when additional computing resources (such as additional CPUs, memory, storage, or I/O bandwidth) are added.
Designing and tuning concurrent applications for scalability can be very different from traditional performance optimization. When tuning for performance, the goal is usually to do the same work with less effort, such as by reusing previously computed results through caching or replacing an O(n²) algorithm with an O(n log n) one. When tuning for scalability, you are instead trying to find ways to parallelize the problem so you can take advantage of additional processing resources to do more work with more resources.
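The caching approach to "same work with less effort" can be sketched in a few lines. This is a minimal, hypothetical example (the class name and the counter are mine, added only to make the effect observable), not an implementation from the text:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Sketch: reuse previously computed results through caching.
// The "expensive" computation runs once per distinct input; repeated
// requests for the same input are served from the cache.
public class MemoizedSquarer {
    private final Map<Integer, Long> cache = new ConcurrentHashMap<>();
    final AtomicInteger computeCount = new AtomicInteger();  // counts actual computations

    public long square(int x) {
        return cache.computeIfAbsent(x, k -> {
            computeCount.incrementAndGet();  // stands in for an expensive computation
            return (long) k * k;
        });
    }
}
```

Calling square(7) twice performs the underlying computation only once; the second call is a cache hit, which is the "less effort" the paragraph refers to.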
These two aspects of performance—how fast and how much—are completely separate, and sometimes even at odds with each other. In order to achieve higher scalability or better hardware utilization, we often end up increasing the amount of work done to process each individual task, such as when we divide tasks into multiple "pipelined" subtasks. Ironically, many of the tricks that improve performance in single-threaded programs are bad for scalability (see Section 11.4.4 for an example).
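To make the pipelining trade-off concrete, here is a hedged two-stage sketch (the class, the doubling "transform," and the sentinel value are all hypothetical choices of mine). Each item now incurs extra per-task work crossing a queue between stages, but the two stages can run on different processors simultaneously:

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.FutureTask;
import java.util.concurrent.LinkedBlockingQueue;

// Sketch: a task divided into two "pipelined" subtasks. Handing items
// across the queue adds work per item, but lets the stages run in parallel.
public class TwoStagePipeline {
    private static final int POISON = Integer.MIN_VALUE;  // assumes inputs never contain this

    public static int pipelineSum(int[] inputs) throws Exception {
        BlockingQueue<Integer> queue = new LinkedBlockingQueue<>();

        Thread stage1 = new Thread(() -> {
            for (int x : inputs) queue.add(x * 2);  // stage 1: transform each item
            queue.add(POISON);                      // signal end of stream
        });

        FutureTask<Integer> stage2 = new FutureTask<>(() -> {
            int sum = 0;
            for (int x; (x = queue.take()) != POISON; ) sum += x;  // stage 2: accumulate
            return sum;
        });

        stage1.start();
        new Thread(stage2).start();
        return stage2.get();  // wait for the final stage to finish
    }
}
```

A single-threaded loop would compute the same sum with no queue traffic at all—faster for one task, but with no way to spread the stages across processors.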
The familiar three-tier application model—in which presentation, business logic, and persistence are separated and may be handled by different systems—illustrates how improvements in scalability often come at the expense of performance. A monolithic application where presentation, business logic, and persistence are intertwined would almost certainly provide better performance for the first unit of work than would a well-factored multitier implementation distributed over multiple systems. How could it not? The monolithic application would not have the network latency inherent in handing off tasks between tiers, nor would it have