FIGURE 3.40 The performance ratio for the A9 compared to the A8, both using a 1 GHz
clock and the same size caches for L1 and L2, shows that the A9 is about 1.28 times
faster. Both runs use a 32 KB primary cache and a 1 MB secondary cache, which is 8-way
set associative for the A8 and 16-way for the A9. The block sizes in the caches are 64 bytes
for the A8 and 32 bytes for the A9. As mentioned in the caption of Figure 3.39, eon makes
intensive use of integer multiply, and the combination of dynamic scheduling and a faster
multiply pipeline significantly improves performance on the A9. twolf experiences a small
slowdown, likely because its cache behavior is worse with the smaller L1 block size of
the A9.
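As a rough illustration of what the ratio in the caption measures: if both processors run at the same clock rate and execute the same number of instructions (an assumption made here for simplicity), the performance ratio reduces to the ratio of their CPIs. The sketch below uses made-up CPI values, not measurements from Figure 3.40, and the helper names `execution_time` and `performance_ratio` are hypothetical.

```python
def execution_time(instruction_count, cpi, clock_hz):
    """CPU time = instruction count x CPI / clock rate."""
    return instruction_count * cpi / clock_hz


def performance_ratio(time_baseline, time_new):
    """How many times faster the new machine is than the baseline."""
    return time_baseline / time_new


# Illustrative numbers only (NOT taken from Figure 3.40).
INSTRUCTIONS = 1_000_000_000   # assumed identical on both processors
CLOCK_HZ = 1_000_000_000       # 1 GHz for both, as in the caption
cpi_a8 = 1.6                   # hypothetical CPI for the A8
cpi_a9 = 1.25                  # hypothetical CPI for the A9

t_a8 = execution_time(INSTRUCTIONS, cpi_a8, CLOCK_HZ)
t_a9 = execution_time(INSTRUCTIONS, cpi_a9, CLOCK_HZ)
ratio = performance_ratio(t_a8, t_a9)
print(f"A9 is {ratio:.2f}x faster than A8")
```

A ratio below 1 would indicate a slowdown on the A9, which is the situation the caption describes for twolf.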
The Intel Core i7
The i7 uses an aggressive out-of-order speculative microarchitecture with reasonably deep
pipelines, aiming for high instruction throughput by combining multiple issue
and high clock rates. Figure 3.41 shows the overall structure of the i7 pipeline. We will
examine the pipeline by starting with instruction fetch and continuing on to instruction commit,
following the steps labeled on the figure.
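The design goal stated above can be read as a simple product: instruction throughput is roughly instructions per clock (IPC) times clock rate, so a design can gain from wider issue, a faster clock, or both. The following is a minimal sketch with invented IPC and clock values, not i7 measurements, and the function name `instructions_per_second` is hypothetical.

```python
def instructions_per_second(ipc, clock_hz):
    """Throughput = instructions per clock x clock rate."""
    return ipc * clock_hz


# Hypothetical design points (not i7 data).
narrow_fast = instructions_per_second(ipc=1.0, clock_hz=3.0e9)
wide_slow = instructions_per_second(ipc=2.0, clock_hz=1.5e9)
wide_fast = instructions_per_second(ipc=2.0, clock_hz=3.0e9)

for name, rate in [("narrow, fast clock", narrow_fast),
                   ("wide, slow clock", wide_slow),
                   ("wide, fast clock", wide_fast)]:
    print(f"{name}: {rate / 1e9:.1f} billion instructions/s")
```

In this toy model the first two design points reach the same throughput, and only the combination of multiple issue and a high clock rate doubles it. Real sustained IPC depends on hazards, cache misses, and branch mispredictions, which the sketch ignores.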
 