Because most instructions produce a different value every time they are executed (or, at least, a different value from a set of values), value prediction can have only limited success. There are, however, certain instructions for which it is easier to predict the resulting value: for example, loads that load from a constant pool or that load a value that changes infrequently. In addition, when an instruction produces a value chosen from a small set of potential values, it may be possible to predict the resulting value by correlating it with other program behavior.
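To make the idea concrete, the sketch below shows one simple way such a predictor could be organized: a table indexed by the low-order bits of the load's PC that remembers the last value the instruction produced, guarded by a small saturating confidence counter. The table size, counter width, and threshold are assumptions chosen for illustration; this is not a description of any real processor's mechanism.

#include <stdint.h>
#include <stdbool.h>

/* Illustrative last-value predictor (sizes and thresholds are assumptions).
 * Each entry keeps the last value an instruction produced and a 2-bit
 * saturating confidence counter; we predict only when confidence is high. */
#define VP_ENTRIES       1024
#define VP_CONF_MAX      3
#define VP_CONF_PREDICT  2

typedef struct {
    uint64_t last_value;
    uint8_t  confidence;
} vp_entry_t;

static vp_entry_t vp_table[VP_ENTRIES];

static unsigned vp_index(uint64_t pc) {
    return (unsigned)((pc >> 2) & (VP_ENTRIES - 1));  /* drop alignment bits */
}

/* Returns true and a predicted value only when the entry is confident. */
bool vp_predict(uint64_t pc, uint64_t *predicted) {
    vp_entry_t *e = &vp_table[vp_index(pc)];
    if (e->confidence >= VP_CONF_PREDICT) {
        *predicted = e->last_value;
        return true;
    }
    return false;
}

/* Called when the instruction actually completes: train the table. */
void vp_update(uint64_t pc, uint64_t actual) {
    vp_entry_t *e = &vp_table[vp_index(pc)];
    if (e->last_value == actual) {
        if (e->confidence < VP_CONF_MAX) e->confidence++;
    } else {
        e->last_value = actual;
        e->confidence = 0;   /* lose confidence after a wrong value */
    }
}

A load that repeatedly returns the same value (a constant-pool access, say) quickly saturates its counter and becomes predictable, while a load whose value keeps changing never gains enough confidence to trigger speculation.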
Value prediction is useful if it significantly increases the amount of available ILP. This possibility is most likely when a value is used as the source of a chain of dependent computations, such as a load. Because value prediction is used to enhance speculation and incorrect speculation has a detrimental performance impact, the accuracy of the prediction is critical.
Although many researchers have focused on value prediction in the past ten years, the results have never been sufficiently attractive to justify their incorporation in real processors. Instead, a simpler and older idea, related to value prediction, has been used: address aliasing prediction. Address aliasing prediction is a simple technique that predicts whether two stores, or a load and a store, refer to the same memory address. If two such references do not refer to the same address, then they may be safely interchanged. Otherwise, we must wait until the memory addresses accessed by the instructions are known. Because we need not actually predict the address values, only whether such values conflict, the prediction is both more stable and simpler. This limited form of address value speculation has already been used in several processors and may become universal in the future.
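One simple, hypothetical organization for such a predictor is sketched below: for each load PC it keeps a saturating counter that is raised whenever speculatively moving that load ahead of an older store turned out to conflict, and decayed when it did not; only loads whose counter reads zero are allowed to bypass unresolved stores. The table size and thresholds are assumptions made for the example, not the design of any particular processor.

#include <stdint.h>
#include <stdbool.h>

/* Illustrative address-aliasing (memory dependence) predictor.
 * Counter values range 0..3; 0 means "no recent conflict observed". */
#define AP_ENTRIES 4096

static uint8_t conflict_counter[AP_ENTRIES];

static unsigned ap_index(uint64_t load_pc) {
    return (unsigned)((load_pc >> 2) & (AP_ENTRIES - 1));
}

/* Predict "no alias": allow the load to bypass older, unresolved stores. */
bool ap_may_bypass(uint64_t load_pc) {
    return conflict_counter[ap_index(load_pc)] == 0;
}

/* Train once the store addresses are known and the conflict is resolved. */
void ap_update(uint64_t load_pc, bool conflicted) {
    uint8_t *c = &conflict_counter[ap_index(load_pc)];
    if (conflicted) {
        if (*c < 3) (*c)++;   /* remember the memory-order violation */
    } else {
        if (*c > 0) (*c)--;   /* slowly regain trust in this load */
    }
}

Note that the predictor never guesses an address; it only guesses whether two references conflict, which is why it is both simpler and more stable than full value prediction.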
3.10 Studies of the Limitations of ILP
Exploiting ILP to increase performance began with the first pipelined processors in the 1960s. In the 1980s and 1990s, these techniques were key to achieving rapid performance improvements. The question of how much ILP exists was critical to our long-term ability to enhance performance at a rate that exceeds the increase in speed of the base integrated circuit technology. On a shorter scale, the critical question of what is needed to exploit more ILP is crucial to both computer designers and compiler writers. The data in this section also provide us with a way to examine the value of ideas that we have introduced in this chapter, including memory disambiguation, register renaming, and speculation.
In this section we review a portion of one of the studies done of these questions (based on Wall's 1993 study). All of these studies of available parallelism operate by making a set of assumptions and seeing how much parallelism is available under those assumptions. The data we examine here are from a study that makes the fewest assumptions; in fact, the ultimate hardware model is probably unrealizable. Nonetheless, all such studies assume a certain level of compiler technology, and some of these assumptions could affect the results, despite the use of incredibly ambitious hardware.
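To make the methodology concrete, the sketch below shows how such a limit study can be computed from a dynamic instruction trace under idealized assumptions: perfect branch prediction, perfect register renaming, and unit instruction latency, with memory-carried dependences ignored for brevity. Each instruction may issue one cycle after the latest producer of its source registers, and the available ILP is the instruction count divided by the length of that ideal schedule. The trace format and field names are assumptions made for this example.

#include <stdint.h>

/* Illustrative ILP limit measurement in the spirit of Wall's study.
 * Only true register data dependences constrain issue; everything else
 * (branches, renaming, memory disambiguation) is assumed perfect. */
typedef struct {
    int src1, src2;   /* architectural source registers, -1 if unused */
    int dest;         /* destination register, -1 if none */
} trace_insn_t;

#define NUM_REGS 64

double measure_ideal_ilp(const trace_insn_t *trace, long n) {
    long ready[NUM_REGS] = {0};   /* cycle at which each register is ready */
    long schedule_length = 0;

    for (long i = 0; i < n; i++) {
        long issue = 0;           /* earliest cycle allowed by true dependences */
        if (trace[i].src1 >= 0 && ready[trace[i].src1] > issue)
            issue = ready[trace[i].src1];
        if (trace[i].src2 >= 0 && ready[trace[i].src2] > issue)
            issue = ready[trace[i].src2];

        long done = issue + 1;    /* unit latency assumption */
        if (trace[i].dest >= 0)
            ready[trace[i].dest] = done;
        if (done > schedule_length)
            schedule_length = done;
    }
    return schedule_length ? (double)n / (double)schedule_length : 0.0;
}

Real limit studies then tighten the assumptions one at a time (finite windows, realistic branch prediction, limited renaming, imperfect disambiguation) and observe how quickly the measured parallelism falls.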
As we will see, for hardware models that have reasonable cost, it is unlikely that the costs of very aggressive speculation can be justified: the inefficiencies in power and use of silicon are simply too high. While many in the research community and the major processor manufacturers were betting in favor of much greater exploitable ILP and were initially reluctant to accept this possibility, by 2005 they were forced to change their minds.
 