poison bit on the result register. If that register is later touched by a regular
instruction, the trap occurs then (as it should). However, if the result is never used,
the poison bit is eventually cleared and no harm is done.
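The deferred-trap behavior described above can be illustrated with a small software model. This is only a sketch under stated assumptions, not a description of any real processor: the names RegisterFile, speculative_load, read, write, and Trap are hypothetical, memory is modeled as a dictionary in which a missing address stands in for a faulting access, and the poison bit is assumed to be cleared when the register is overwritten.

```python
class Trap(Exception):
    """Hypothetical exception: a regular instruction read a poisoned register."""


class RegisterFile:
    """Toy register file with one poison bit per register (illustrative only)."""

    def __init__(self, count=8):
        self.value = [0] * count
        self.poison = [False] * count

    def speculative_load(self, dst, memory, addr):
        # A speculative load that would fault does not trap immediately;
        # it only marks the result register as poisoned.
        if addr in memory:
            self.value[dst] = memory[addr]
            self.poison[dst] = False
        else:
            self.poison[dst] = True

    def read(self, src):
        # A regular instruction touching a poisoned register takes the
        # deferred trap at this point.
        if self.poison[src]:
            raise Trap(f"deferred fault on register {src}")
        return self.value[src]

    def write(self, dst, value):
        # Assumption: overwriting the register clears its poison bit,
        # so a never-used speculative result causes no trap at all.
        self.value[dst] = value
        self.poison[dst] = False


if __name__ == "__main__":
    regs = RegisterFile()
    memory = {0x10: 42}          # address 0x99 is deliberately absent

    regs.speculative_load(1, memory, 0x99)   # would fault, so only poisons r1
    regs.write(1, 7)                         # result never used; poison cleared
    print(regs.read(1))

    regs.speculative_load(2, memory, 0x99)   # poisons r2
    try:
        regs.read(2)                         # the use triggers the trap
    except Trap as err:
        print("trap:", err)
```

The point of the design is that the cost of a fault is paid only on the path where the speculatively loaded value is actually consumed.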
4.6 EXAMPLES OF THE MICROARCHITECTURE LEVEL
In this section, we will look at three state-of-the-art processors to show how
they employ the concepts explored in this chapter. The treatment will of necessity
be brief because real machines are enormously complex, containing millions of
gates. The examples are the same ones we have been using so far: the Core i7,
the OMAP4430, and the ATmega168.
4.6.1 The Microarchitecture of the Core i7 CPU
On the outside, the Core i7 appears to be a traditional CISC machine, with
processors that support a huge and unwieldy instruction set, including 8-, 16-, and
32-bit integer operations as well as 32-bit and 64-bit floating-point operations. It
has only eight visible registers per processor and no two of them are quite the
same. Instruction lengths vary from 1 to 17 bytes. In short, it is a legacy
architecture that seems to do everything wrong.
However, on the inside, the Core i7 contains a modern, lean-and-mean, deeply
pipelined RISC core that runs at an extremely fast clock rate that is likely to
increase in the years ahead. It is quite amazing how the Intel engineers managed to
build a state-of-the-art processor to implement an ancient architecture. In this
section we will look at the Core i7 microarchitecture to see how it works.
Overview of the Core i7's Sandy Bridge Microarchitecture
The Core i7 microarchitecture, called the Sandy Bridge microarchitecture, is a
significant refinement of the previous-generation Intel microarchitectures,
including the earlier P4 and P6. A rough overview of the Core i7 microarchitecture is
given in Fig. 4-46.
The Core i7 consists of four major subsections: the memory subsystem, the
front end, the out-of-order control, and the execution units. Let us examine these
one at a time starting at the upper left and going counterclockwise around the chip.
Each processor in the Core i7 contains a memory subsystem with a unified L2
(level 2) cache as well as the logic for accessing the L3 (level 3) cache. A single
large L3 cache is shared by all processors, and it is the last stop before leaving the
CPU chip and making the very long trip to external RAM over the memory bus.
The Core i7's L2 caches are 256 KB in size, and each is organized as an 8-way
 
 
 