Java Reference
In-Depth Information
burning cycles with useless computation; we want to keep the CPUs busy with useful work.)
If the program is compute-bound, then we may be able to increase its capacity by adding
more processors; if it can't even keep the processors we have busy, adding more won't help.
Threading offers a means to keep the CPU(s) “hotter” by decomposing the application so
there is always work to be done by an available processor.
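As a hedged illustration of this idea (the class and method names here are hypothetical, not from the text), one common way to keep every processor supplied with work is to size a fixed thread pool to the number of available CPUs and split a computation into independent subtasks:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Sketch: decompose a computation so an available processor always has work.
// The pool is sized to the CPU count; the input range is split into chunks.
public class KeepCpusBusy {
    public static long sumOfSquares(int n) throws Exception {
        int cpus = Runtime.getRuntime().availableProcessors();
        ExecutorService pool = Executors.newFixedThreadPool(cpus);
        try {
            List<Future<Long>> results = new ArrayList<>();
            int chunk = Math.max(1, n / cpus);
            for (int start = 1; start <= n; start += chunk) {
                final int lo = start, hi = Math.min(n, start + chunk - 1);
                results.add(pool.submit(() -> {
                    long s = 0;
                    for (int i = lo; i <= hi; i++) s += (long) i * i;  // independent subtask
                    return s;
                }));
            }
            long total = 0;
            for (Future<Long> f : results) total += f.get();  // combine partial results
            return total;
        } finally {
            pool.shutdown();
        }
    }
}
```

With enough independent chunks, each processor stays busy until the whole range is consumed, which is exactly the decomposition the paragraph describes.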
11.1.1. Performance Versus Scalability
Application performance can be measured in a number of ways, such as service time, latency, throughput, efficiency, scalability, or capacity. Some of these (service time, latency) are measures of "how fast" a given unit of work can be processed or acknowledged; others (capacity, throughput) are measures of "how much" work can be performed with a given quantity of computing resources.
Scalability describes the ability to improve throughput or capacity when additional computing resources (such as additional CPUs, memory, storage, or I/O bandwidth) are added.
Designing and tuning concurrent applications for scalability can be very different from traditional performance optimization. When tuning for performance, the goal is usually to do the same work with less effort, such as by reusing previously computed results through caching or replacing an O(n²) algorithm with an O(n log n) one. When tuning for scalability, you are instead trying to find ways to parallelize the problem so you can take advantage of additional processing resources to do more work with more resources.
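The caching approach to "same work with less effort" can be sketched in a few lines. This is a minimal, hypothetical example (the class name and the counter are mine, added only to make the effect observable), not an implementation from the text:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Sketch: reuse previously computed results through caching.
// The "expensive" computation runs once per distinct input; repeated
// requests for the same input are served from the cache.
public class MemoizedSquarer {
    private final Map<Integer, Long> cache = new ConcurrentHashMap<>();
    final AtomicInteger computeCount = new AtomicInteger();  // counts actual computations

    public long square(int x) {
        return cache.computeIfAbsent(x, k -> {
            computeCount.incrementAndGet();  // stands in for an expensive computation
            return (long) k * k;
        });
    }
}
```

Calling square(7) twice performs the underlying computation only once; the second call is a cache hit, which is the "less effort" the paragraph refers to.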
These two aspects of performance—how fast and how much—are completely separate, and sometimes even at odds with each other. In order to achieve higher scalability or better hardware utilization, we often end up increasing the amount of work done to process each individual task, such as when we divide tasks into multiple "pipelined" subtasks. Ironically, many of the tricks that improve performance in single-threaded programs are bad for scalability (see Section 11.4.4 for an example).
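To make the pipelining trade-off concrete, here is a hedged two-stage sketch (the class, the doubling "transform," and the sentinel value are all hypothetical choices of mine). Each item now incurs extra per-task work crossing a queue between stages, but the two stages can run on different processors simultaneously:

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.FutureTask;
import java.util.concurrent.LinkedBlockingQueue;

// Sketch: a task divided into two "pipelined" subtasks. Handing items
// across the queue adds work per item, but lets the stages run in parallel.
public class TwoStagePipeline {
    private static final int POISON = Integer.MIN_VALUE;  // assumes inputs never contain this

    public static int pipelineSum(int[] inputs) throws Exception {
        BlockingQueue<Integer> queue = new LinkedBlockingQueue<>();

        Thread stage1 = new Thread(() -> {
            for (int x : inputs) queue.add(x * 2);  // stage 1: transform each item
            queue.add(POISON);                      // signal end of stream
        });

        FutureTask<Integer> stage2 = new FutureTask<>(() -> {
            int sum = 0;
            for (int x; (x = queue.take()) != POISON; ) sum += x;  // stage 2: accumulate
            return sum;
        });

        stage1.start();
        new Thread(stage2).start();
        return stage2.get();  // wait for the final stage to finish
    }
}
```

A single-threaded loop would compute the same sum with no queue traffic at all—faster for one task, but with no way to spread the stages across processors.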
The familiar three-tier application model—in which presentation, business logic, and persistence are separated and may be handled by different systems—illustrates how improvements in scalability often come at the expense of performance. A monolithic application where presentation, business logic, and persistence are intertwined would almost certainly provide better performance for the first unit of work than would a well-factored multitier implementation distributed over multiple systems. How could it not? The monolithic application would not have the network latency inherent in handing off tasks between tiers, nor would it have