Hardware Reference
In-Depth Information
A deeper pipeline effectively breaks down instructions into smaller microsteps, which
allows overall higher clock rates to be achieved using the same silicon technology.
However, this also means that overall fewer instructions can be executed in a single cycle
as compared to processors with shorter pipelines. This is because, if a branch prediction
or speculative execution step fails (which happens fairly frequently inside the processor
as it attempts to line up instructions in advance), the entire pipeline has to be flushed and
refilled. Thus, if you compared an Intel Core i7 or AMD Phenom to a Pentium 4 running
at the same clock speed, the Core i7 and Phenom would execute more instructions in the
same number of cycles.
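
The cost of a pipeline flush can be put in rough numbers. The following is a minimal sketch, not a model of any real processor: it assumes each mispredicted branch wastes about as many cycles as the pipeline has stages, and the function name, branch fraction, and misprediction rate are all hypothetical values chosen only for illustration.

```python
def effective_ipc(base_ipc, branch_fraction, mispredict_rate, pipeline_depth):
    """Estimate instructions per cycle once pipeline flushes are counted.

    Assumption (illustrative only): every mispredicted branch costs
    roughly `pipeline_depth` cycles to flush and refill the pipeline.
    """
    base_cpi = 1.0 / base_ipc  # cycles per instruction with no flushes
    # Average extra cycles per instruction caused by mispredicted branches.
    stall_cpi = branch_fraction * mispredict_rate * pipeline_depth
    return 1.0 / (base_cpi + stall_cpi)


# Hypothetical workload: 20% branches, 5% of them mispredicted.
short_pipeline = effective_ipc(1.0, 0.20, 0.05, pipeline_depth=10)
deep_pipeline = effective_ipc(1.0, 0.20, 0.05, pipeline_depth=31)

print(f"10-stage pipeline: {short_pipeline:.3f} instructions per cycle")
print(f"31-stage pipeline: {deep_pipeline:.3f} instructions per cycle")
```

With identical branch behavior, the deeper pipeline completes fewer instructions per cycle simply because each failed prediction throws away more work.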
Although it is a disadvantage to have a deeper pipeline in terms of instruction efficiency,
processors with deeper pipelines can run at higher clock rates on a given manufacturing
technology. Thus, even though a deeper pipeline might be less efficient, the higher resulting
clock speeds can make up for it. The deeper 20- or 31-stage pipeline in the P4 architecture
enabled significantly higher clock speeds to be achieved using the same silicon
die process as other chips. As an example, the 0.13-micron process Pentium 4 ran up to
3.4GHz, whereas the Athlon XP topped out at 2.2GHz (3200+ model) in the same introduction
timeframe. Even though the Pentium 4 executes fewer instructions in each cycle,
the overall higher cycling speeds made up for the loss of efficiency; the higher clock speed
of one chip and the more efficient processing of the other effectively cancelled each other out.
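
The trade-off described above comes down to simple arithmetic: instructions completed per second are roughly instructions per cycle multiplied by cycles per second. The sketch below uses the two clock speeds quoted in the text; the function names are hypothetical, and the per-cycle figure of 1.0 is a placeholder rather than a measured value for either chip.

```python
def instructions_per_second(ipc, clock_ghz):
    """Rough throughput: instructions per cycle times cycles per second."""
    return ipc * clock_ghz * 1e9


def break_even_ipc_ratio(fast_clock_ghz, slow_clock_ghz):
    """Fraction of the slower-clocked chip's per-cycle work that the
    faster-clocked chip must still complete to match its throughput."""
    return slow_clock_ghz / fast_clock_ghz


# Clock speeds from the text: Pentium 4 at 3.4GHz, Athlon XP at 2.2GHz.
ratio = break_even_ipc_ratio(3.4, 2.2)  # about 0.65

# Placeholder efficiency for the slower-clocked chip (not a measurement).
athlon_ipc = 1.0
athlon_rate = instructions_per_second(athlon_ipc, 2.2)
p4_rate = instructions_per_second(athlon_ipc * ratio, 3.4)

print(f"Break-even efficiency ratio: {ratio:.2f}")
```

In other words, under this simple model the 3.4GHz chip could complete only about 65 percent as many instructions per cycle as the 2.2GHz chip and still deliver the same overall throughput.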
Unfortunately, the deep pipeline combined with high clock rates did come with a penalty
in power consumption, and therefore heat generation as well. Eventually it was determined
that the power penalty was too great, causing Intel to drop back to a more efficient
design in its newer Core microarchitecture processors. Rather than solely increase clock
rates, performance was increased by combining multiple processors into a single chip,
thus improving the effective instruction efficiency even further. This began the push toward
multicore processors.
One thing is clear in all of this confusion: Raw clock speed is not a good way to compare
chips, unless they are from the same manufacturer, model, and family.
To fairly compare various CPUs at different clock speeds, Intel originally devised a specific
series of benchmarks called the Intel Comparative Microprocessor Performance
(iCOMP) index. The iCOMP index benchmark was released in original iCOMP, iCOMP
2.0, and iCOMP 3.0 versions.
The iCOMP 2.0 index was derived from several independent benchmarks as an indication
of relative processor performance. The benchmarks balance integer with floating-point
and multimedia performance.