program. Bottleneck is a somewhat pejorative term that isn't really fair. After all, whichever
subsystem is the bottleneck is the one that's doing your work! There is also a general tendency to
want to "balance out" the work across the different subsystems, keeping them all busy all the time.
Once again, that's a bit inaccurate. Balancing the work is useful only if it helps your program run faster.
In Figure 15-6 we show a representation of where a program is spending its time and where the
bottleneck is with respect to CPU, cache latency, and I/O latency. Each block represents how busy
that subsystem is during some period of time (say, 10 µs).
Figure 15-6. Performance Bottlenecks and Capacities of Programs
Black indicates a subsystem used at full capacity; white indicates zero usage. A black CPU is
never stalled for anything; the other subsystems are waiting for it to make requests. A black cache
indicates that the CPU is stalled, waiting for data at least some of the time, and the same for I/O.
Depending upon system design, it may or may not actually be possible for CPU and cache to be
busy simultaneously. (We show a system where there is overlap.) The solid white sections for
CPUs 1 and 2 indicate that they are suffering contention, waiting for CPU 0 to release a lock.
Typically, we expect CPU and cache to take turns being the bottleneck, alternating very rapidly.
When I/O is the bottleneck, it will be so for extended periods of time (the latency on a disk read
runs on the order of 20 ms).
By definition, there must be a line of solid black from one end of our graph to the other. In some
sense, the more solid black in the CPU section, the more work is getting done. A typical subgoal
will be to maximize the amount of time that all the CPUs actually work. (The primary goal is to
make the program run fast. Normally, you expect that making more CPUs do more work will have
that effect.) Eliminating contention is a major factor in doing so.
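The reading of the figure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the utilization numbers are invented, the subsystem names (cpu0, cpu1, cache, io) and the helper functions bottlenecks and cpu_busy_fraction are hypothetical, and a real measurement would have to come from a profiler or hardware counters. Each list entry stands for one block in the figure (1.0 is black, 0.0 is white).

```python
from typing import Dict, List

# Invented sample data, one value per time interval (assumption, not measured).
# 1.0 = subsystem at full capacity ("black"), 0.0 = idle ("white").
SAMPLES: Dict[str, List[float]] = {
    "cpu0":  [1.0, 1.0, 0.25, 0.0, 1.0, 0.0],
    "cpu1":  [0.0, 1.0, 0.25, 0.0, 0.0, 0.0],  # interval 0: idle, e.g. waiting on a lock
    "cache": [0.5, 0.25, 1.0, 0.0, 0.5, 0.0],
    "io":    [0.0, 0.0, 0.0, 1.0, 0.0, 1.0],
}


def bottlenecks(samples: Dict[str, List[float]], full: float = 1.0) -> List[List[str]]:
    """For each interval, list the subsystems running at full capacity."""
    n = len(next(iter(samples.values())))
    return [
        [name for name, usage in samples.items() if usage[i] >= full]
        for i in range(n)
    ]


def cpu_busy_fraction(samples: Dict[str, List[float]], prefix: str = "cpu") -> float:
    """Average utilization over all CPUs and all intervals."""
    cpus = [usage for name, usage in samples.items() if name.startswith(prefix)]
    total = sum(sum(usage) for usage in cpus)
    return total / (len(cpus) * len(cpus[0]))


if __name__ == "__main__":
    per_interval = bottlenecks(SAMPLES)
    # The "line of solid black": every interval has at least one bottleneck.
    print(all(per_interval))
    print(per_interval)
    print(cpu_busy_fraction(SAMPLES))
```

In this made-up trace the CPUs are busy only 37.5% of the time. Removing the lock wait in interval 0 and the I/O stalls in intervals 3 and 5 would raise that figure, which is the subgoal the text describes.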
Benchmarks and Repeatable Testing
Before you get into the details of optimizing your code, you need to be very clear on what your
starting point is and what your objective is. Your overall objective is to make the entire system run
faster. Perhaps you have a specific target (you need 13.5% improvement to beat the competition);