Instruction-Level Parallelism and Its Exploitation - Computer Architecture: A Quantitative Approach


FIGURE 3.24 Prediction accuracy for a return address buffer operated as a stack on a number of SPEC CPU95 benchmarks. The accuracy is the fraction of return addresses predicted correctly. A buffer of 0 entries implies that the standard branch prediction is used. Since call depths are typically not large, with some exceptions, a modest buffer works well. These data come from Skadron et al. [1999] and use a fix-up mechanism to prevent corruption of the cached return addresses.
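A return address buffer operated as a stack can be sketched in a few lines. The model below is illustrative only: the class name `ReturnAddressStack`, the helper `return_accuracy`, and the trace format are assumptions made for this example, not taken from Skadron et al. It uses a circular buffer that overwrites the oldest entry on overflow, it omits the fix-up mechanism mentioned in the caption, and it counts a 0-entry buffer as never predicting rather than falling back to the standard branch predictor.

```python
class ReturnAddressStack:
    """Fixed-size return address buffer operated as a stack.

    Hypothetical model: a circular buffer that overwrites the oldest
    entry when call depth exceeds the number of entries.
    """

    def __init__(self, entries):
        self.entries = entries
        self.buf = [None] * entries
        self.top = 0      # index of the next free slot
        self.count = 0    # number of valid entries, at most `entries`

    def push(self, return_addr):
        """Record the return address of a call."""
        if self.entries == 0:
            return
        self.buf[self.top] = return_addr
        self.top = (self.top + 1) % self.entries
        self.count = min(self.count + 1, self.entries)

    def pop(self):
        """Predict the target of a return; None means no prediction."""
        if self.count == 0:
            return None
        self.top = (self.top - 1) % self.entries
        self.count -= 1
        return self.buf[self.top]


def return_accuracy(trace, entries):
    """Fraction of returns in `trace` whose address is predicted correctly.

    `trace` is a list of ("call", return_addr) or ("ret", actual_target)
    pairs; this format is an assumption made for the example.
    """
    ras = ReturnAddressStack(entries)
    correct = total = 0
    for kind, addr in trace:
        if kind == "call":
            ras.push(addr)
        else:
            total += 1
            if ras.pop() == addr:
                correct += 1
    return correct / total if total else 0.0


# Three nested calls followed by three returns.
trace = [("call", 0x104), ("call", 0x204), ("call", 0x304),
         ("ret", 0x304), ("ret", 0x204), ("ret", 0x104)]
for n in (1, 2, 4):
    print(n, return_accuracy(trace, n))
```

In this toy trace the call depth is 3, so a 4-entry buffer predicts every return, while 1-entry and 2-entry buffers lose the outermost return addresses to overwriting. This mirrors the caption's point that a modest buffer suffices when call depths are small.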

Integrated Instruction Fetch Units

To meet the demands of multiple-issue processors, many recent designers have chosen to implement an integrated instruction fetch unit as a separate autonomous unit that feeds instructions to the rest of the pipeline. Essentially, this amounts to recognizing that, given the complexities of multiple issue, characterizing instruction fetch as a simple single pipe stage is no longer valid.

Recent designs have instead used an integrated instruction fetch unit that combines several functions:

1. Integrated branch prediction: The branch predictor becomes part of the instruction fetch unit and is constantly predicting branches, so as to drive the fetch pipeline.

2. Instruction prefetch: To deliver multiple instructions per clock, the instruction fetch unit will likely need to fetch ahead. The unit autonomously manages the prefetching of instructions (see Chapter 2 for a discussion of techniques for doing this), integrating it with branch prediction.
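The two functions listed above can be illustrated with a toy model in which the predictor chooses each next fetch address and the unit fetches ahead into a buffer. This is a minimal sketch under stated assumptions, not a description of any real design: the class `FetchUnit`, its parameters, the fixed 4-byte instruction size, and the table of predicted-taken targets are all hypothetical.

```python
from collections import deque


class FetchUnit:
    """Toy integrated fetch unit (hypothetical model for illustration)."""

    def __init__(self, predicted_targets, width=4, buffer_size=8):
        # Assumed predictor state: maps a branch PC to its predicted-taken target.
        self.predicted_targets = predicted_targets
        self.width = width              # instructions fetched per cycle
        self.buffer_size = buffer_size  # capacity of the prefetch buffer
        self.pc = 0
        self.buffer = deque()

    def cycle(self):
        """Prefetch up to `width` instruction addresses along the predicted path."""
        fetched = 0
        while fetched < self.width and len(self.buffer) < self.buffer_size:
            self.buffer.append(self.pc)
            # Integrated branch prediction: the predictor picks the next fetch PC.
            # Assumes fixed 4-byte instructions on the fall-through path.
            self.pc = self.predicted_targets.get(self.pc, self.pc + 4)
            fetched += 1

    def issue(self, n):
        """Hand up to `n` buffered instruction addresses to the rest of the pipeline."""
        return [self.buffer.popleft() for _ in range(min(n, len(self.buffer)))]


fu = FetchUnit({8: 100})   # the branch at PC 8 is predicted taken to PC 100
fu.cycle()
first = fu.issue(2)        # the pipeline consumes two instructions
fu.cycle()                 # the unit keeps fetching ahead on its own
second = fu.issue(3)
print(first, second)
```

The fetch unit runs ahead of the consumer: after the first cycle the buffer already holds addresses past the predicted-taken branch, so the pipeline can keep draining instructions without waiting on the branch itself.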
