Hardware Reference
In-Depth Information
instructions in the same number of cycles. The same thing is true for a Pentium; it executes about
twice as many instructions in a given number of cycles as a 486. Therefore, given the same clock
speed, a Pentium is twice as fast as a 486, and consequently a 133MHz 486-class processor (such as
the AMD 5x86-133) is not even as fast as a 75MHz Pentium! That is because Pentium megahertz are
“worth” about double what 486 megahertz are worth in terms of instructions completed per cycle.
The Pentium II and III are about 50% faster than an equivalent Pentium at a given clock speed because
they can execute about that many more instructions in the same number of cycles.
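The comparison above amounts to multiplying clock speed by the relative number of instructions completed per cycle. The following is a minimal sketch of that arithmetic, not a benchmark: the ratios are the rough figures quoted in the text (486 = 1, Pentium = 2, Pentium II/III = 3), and the names RELATIVE_IPC and equivalent_486_mhz are made up for illustration.

```python
# Relative instructions completed per cycle, normalized to a 486 = 1.0.
# The ratios come from the rough figures in the text, not from benchmarks.
RELATIVE_IPC = {
    "486": 1.0,
    "Pentium": 2.0,         # about twice a 486
    "Pentium II/III": 3.0,  # about 50% more than a Pentium
}


def equivalent_486_mhz(cpu_class: str, clock_mhz: float) -> float:
    """Return the 486 clock speed that would do roughly the same work."""
    return RELATIVE_IPC[cpu_class] * clock_mhz


if __name__ == "__main__":
    for cpu_class, mhz in [("486", 133), ("Pentium", 75)]:
        work = equivalent_486_mhz(cpu_class, mhz)
        print(f"{mhz}MHz {cpu_class}: ~{work:.0f} 486-equivalent MHz")
```

Under these assumptions the 75MHz Pentium works out to roughly 150 "486-equivalent" megahertz, which is why it comes out ahead of the 133MHz 486-class chip.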
Unfortunately, after the Pentium III, it becomes much more difficult to compare processors on clock
speed alone. This is because the different internal architectures make some processors more efficient
than others, but these same efficiency differences result in circuitry that is capable of running at
different maximum speeds. The less efficient the circuit, the higher the clock speed it can attain, and
vice versa. Another difference is that some of the later processors include varying sizes of L2 and L3
cache.
One of the biggest factors in efficiency is the number of stages in the processor's internal pipeline
(see Table 3.6).
Table 3.6. Number of Pipeline Stages per CPU
A deeper pipeline effectively breaks down instructions into smaller microsteps, which allows higher
overall clock rates to be achieved using the same silicon technology. However, this also means that
fewer instructions are completed per cycle, on average, than in processors with shorter
pipelines. This is because, if a branch prediction or speculative execution step fails (which happens
fairly frequently inside the processor as it attempts to line up instructions in advance), the entire
pipeline has to be flushed and refilled, and a longer pipeline takes more cycles to refill. Thus, if
you compared an Intel Core i7 or AMD FX to a Pentium 4 running at the same clock speed, the Core i7
and FX would execute more instructions in the same number of cycles.
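The cost of a flush can be illustrated with a simplified textbook-style model in which every mispredicted branch costs one full pipeline refill. This is only a sketch under stated assumptions: the branch fraction, misprediction rate, and base cycles per instruction below are invented for illustration and are not measurements of any real processor, and the 10-stage case is simply an arbitrary shorter pipeline for comparison with the 20- and 31-stage designs.

```python
def effective_ipc(pipeline_stages: int,
                  branch_fraction: float = 0.20,  # assumed share of instructions that are branches
                  mispredict_rate: float = 0.05,  # assumed share of branches predicted wrongly
                  base_cpi: float = 1.0) -> float:  # assumed cycles per instruction with no flushes
    """Estimate instructions per cycle when each misprediction costs a full refill."""
    # Average extra cycles per instruction spent refilling the pipeline.
    flush_cycles_per_instruction = branch_fraction * mispredict_rate * pipeline_stages
    return 1.0 / (base_cpi + flush_cycles_per_instruction)


if __name__ == "__main__":
    for stages in (10, 20, 31):
        print(f"{stages}-stage pipeline: ~{effective_ipc(stages):.2f} instructions per cycle")
```

Whatever values are chosen for the assumed rates, the estimate falls as the stage count rises, which is the efficiency loss described above.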
Although it is a disadvantage to have a deeper pipeline in terms of instruction efficiency, processors
with deeper pipelines can run at higher clock rates on a given manufacturing technology. Thus, even
though a deeper pipeline might be less efficient, the higher resulting clock speeds can make up for it.
The deeper 20- or 31-stage pipeline in the P4 architecture enabled significantly higher clock speeds
to be achieved using the same silicon die process as other chips. As an example, the 0.13-micron
process Pentium 4 ran up to 3.4GHz, whereas the Athlon XP topped out at 2.2GHz (3200+ model) in
the same introduction timeframe. Even though the Pentium 4 executes fewer instructions in each cycle,
its higher clock speed made up for the loss of efficiency; the clock-speed advantage and the
efficiency disadvantage effectively canceled each other out.
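If the two effects roughly cancel, the ratio of the clock speeds indicates how much more work per cycle the slower-clocked chip must be doing. A minimal sketch of that break-even arithmetic, using the 3.4GHz and 2.2GHz figures from the example (the function name is hypothetical, and real-world results vary by workload):

```python
def required_ipc_advantage(fast_clock_ghz: float, slow_clock_ghz: float) -> float:
    """How many times more work per cycle the slower-clocked chip needs to break even."""
    return fast_clock_ghz / slow_clock_ghz


if __name__ == "__main__":
    ratio = required_ipc_advantage(3.4, 2.2)
    print(f"Per-cycle advantage needed to match: {ratio:.2f}x")
```

For these clock speeds the ratio is about 1.55, so the Athlon XP would need to complete roughly 55% more instructions per cycle to keep pace.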
Unfortunately, the deep pipeline combined with high clock rates did come with a penalty in power
 