1. Increasing the clock speed.
2. Putting two CPUs on a chip.
3. Adding functional units.
4. Making the pipeline longer.
5. Using multithreading.
An obvious way to improve performance is to increase the clock speed without changing anything else. Doing this is relatively straightforward and well understood, so each new chip that comes out is generally slightly faster than its predecessor. Unfortunately, a faster clock has two main drawbacks that limit how great an increase can be tolerated. First, a faster clock uses more energy, which is a huge problem for notebook computers and other battery-powered devices. Second, the extra energy input means the chip gets hotter and there is more heat to dissipate.
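A commonly cited first-order approximation makes this energy cost concrete: the dynamic power dissipated by CMOS logic scales roughly as

    P \approx C \, V^{2} \, f

where C is the switched capacitance, V the supply voltage, and f the clock frequency. Since a higher clock frequency often also requires a higher supply voltage to be sustained, power (and therefore heat) can grow faster than linearly with clock speed.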
Putting two CPUs on a chip is relatively straightforward, but it comes close to doubling the chip area if each one has its own caches, and thus reduces the number of chips per wafer by a factor of two, which essentially doubles the unit manufacturing cost. If the two CPUs share a common cache as big as the original one, the chip area is not doubled, but the cache size per CPU is halved, cutting into performance. Also, while high-end server applications can often fully utilize multiple CPUs, not all desktop applications have enough inherent parallelism to warrant two full CPUs.
Adding additional functional units is also fairly easy, but it is important to get the balance right. Having 10 ALUs does little good if the chip is incapable of feeding instructions into the pipeline fast enough to keep them all busy.
A longer pipeline with more stages, each doing a smaller piece of work in a shorter time period, increases performance but also magnifies the negative effects of branch mispredictions, cache misses, interrupts, and other factors that disrupt normal pipeline flow. Furthermore, to take full advantage of a longer pipeline, the clock speed has to be increased, which means more energy is consumed and more heat is produced.
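To see how disrupted pipeline flow shows up in practice, here is a minimal sketch (an illustrative benchmark, not taken from the text) in C. It sums the array elements above a threshold, first on randomly ordered data and then on sorted data. On a deeply pipelined CPU the unsorted pass is typically much slower, because the data-dependent branch is frequently mispredicted and the pipeline must be flushed and refilled.

#include <stdio.h>
#include <stdlib.h>
#include <time.h>

#define N 10000000

/* Sum all elements greater than 127; the if() is a data-dependent branch. */
static long long sum_above(const unsigned char *a, size_t n)
{
    long long total = 0;
    for (size_t i = 0; i < n; i++)
        if (a[i] > 127)
            total += a[i];
    return total;
}

static int cmp(const void *x, const void *y)
{
    return *(const unsigned char *)x - *(const unsigned char *)y;
}

int main(void)
{
    unsigned char *a = malloc(N);
    if (a == NULL)
        return 1;
    for (size_t i = 0; i < N; i++)
        a[i] = rand() & 0xFF;          /* random data: branch outcome is unpredictable */

    clock_t t0 = clock();
    long long s1 = sum_above(a, N);    /* many branch mispredictions expected */
    clock_t t1 = clock();

    qsort(a, N, 1, cmp);               /* sorting makes the branch highly predictable */
    clock_t t2 = clock();
    long long s2 = sum_above(a, N);
    clock_t t3 = clock();

    printf("unsorted: sum %lld in %.3f s\n", s1, (double)(t1 - t0) / CLOCKS_PER_SEC);
    printf("sorted:   sum %lld in %.3f s\n", s2, (double)(t3 - t2) / CLOCKS_PER_SEC);
    free(a);
    return 0;
}

Compiled with a modest optimization level (e.g., gcc -O1), the unsorted pass is usually several times slower; an aggressively optimizing compiler may convert the branch to a conditional move or vectorize the loop, which hides the effect.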
Finally, multithreading can be added. Its value lies in letting a second thread utilize hardware that would otherwise lie fallow. After some experimentation, it became clear that a 5% increase in chip area for multithreading support gave a 25% performance gain in many applications, making this a good choice. Intel's first multithreaded CPU was the Xeon in 2002, and multithreading was later added to the Pentium 4, starting with the 3.06-GHz version and continuing with faster versions, as well as later processors, including the Core i7. Intel calls the implementation of multithreading used in its processors hyperthreading.
The basic idea is to allow two threads (or possibly processes, since the CPU cannot tell what is a thread and what is a process) to run at once. To the operating system, a hyperthreaded chip looks like two CPUs that happen to share a common cache and main memory.
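As a minimal sketch of this view from software (assuming a Linux or other POSIX-like system where sysconf(_SC_NPROCESSORS_ONLN) is available; none of these calls appear in the text above), the C program below asks the operating system how many logical processors it sees and then starts two threads that run at the same time. On a hyperthreaded single-core chip the reported count is 2, even though there is only one physical core.

#include <pthread.h>
#include <stdio.h>
#include <unistd.h>

/* Each thread just burns some CPU time so both can be scheduled at once. */
static void *spin(void *arg)
{
    volatile unsigned long x = 0;
    for (unsigned long i = 0; i < 100000000UL; i++)
        x += i;
    printf("thread %ld done\n", (long)arg);
    return NULL;
}

int main(void)
{
    /* The OS reports one "processor" per hardware thread, not per physical core. */
    long logical_cpus = sysconf(_SC_NPROCESSORS_ONLN);
    printf("logical processors visible to the OS: %ld\n", logical_cpus);

    pthread_t t1, t2;
    pthread_create(&t1, NULL, spin, (void *)1L);
    pthread_create(&t2, NULL, spin, (void *)2L);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    return 0;
}

Compile with cc prog.c -pthread. On Linux, /proc/cpuinfo or the lscpu command distinguishes physical cores from the hardware threads the operating system schedules onto.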