THE INSTRUCTION SET ARCHITECTURE LEVEL - Structured Computer Organization

5.8.6 Speculative Loads

Another feature of the IA-64 that speeds up execution is the presence of

speculative LOADs. If a LOAD is speculative and it fails, instead of causing an exception,

it just stops and a bit associated with the register to be loaded is set marking the

register as invalid. This is just the poison bit introduced in Chap. 4. If it turns out

that the poisoned register is later used, the exception occurs at that time; otherwise,

it never happens.

The way speculation is normally used is for the compiler to hoist LOADs to

positions before they are needed. By starting early, they may be finished before the

results are needed. At the place where the compiler needs to use the register just

loaded, it inserts a CHECK instruction. If the value is there, CHECK acts like a NOP

and execution continues immediately. If the value is not there yet, the next

instruction must stall. If an exception occurred and the poison bit is on, the pending

exception occurs at that point.
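
The deferred-exception behavior described above can be modeled in a few lines of Python. This is only a sketch under stated assumptions, not IA-64 code: the names RegisterFile, speculative_load, check, and DeferredFault are hypothetical, memory is modeled as a dictionary in which a missing address stands for a faulting access, and the stall that occurs when a value has not yet arrived is not modeled.

```python
class DeferredFault(Exception):
    """Hypothetical exception: raised when a poisoned register is actually used."""


class RegisterFile:
    """Toy register file with one poison bit per register (illustrative only)."""

    def __init__(self):
        self.value = {}       # register name -> loaded value
        self.poison = set()   # registers whose speculative LOAD failed

    def speculative_load(self, reg, memory, addr):
        # A failing speculative LOAD raises nothing; it only marks the
        # target register as invalid by setting its poison bit.
        if addr in memory:
            self.value[reg] = memory[addr]
            self.poison.discard(reg)
        else:
            self.value.pop(reg, None)
            self.poison.add(reg)

    def check(self, reg):
        # For a valid register CHECK acts like a NOP and the value is used.
        # For a poisoned register the pending exception surfaces here.
        if reg in self.poison:
            raise DeferredFault(f"deferred fault on {reg}")
        return self.value[reg]


memory = {0x100: 42}                         # assumed: only address 0x100 is valid
regs = RegisterFile()
regs.speculative_load("r1", memory, 0x100)   # hoisted LOAD that succeeds
regs.speculative_load("r2", memory, 0x200)   # hoisted LOAD that fails silently
result = regs.check("r1")                    # value is present, execution continues
```

In this sketch r2 is poisoned but never checked, so its fault never occurs, which matches the case in which the speculatively loaded value turns out not to be needed.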

In summary, a machine implementing the IA-64 architecture gets its speed

from several different sources. At the core is a state-of-the-art pipelined,

load/store, three-address RISC engine. That is already a big improvement over the

overly complex IA-32 architecture.

In addition, the IA-64 has a model of explicit parallelism that requires the

compiler to figure out which instructions can be executed at the same time without

conflicts and group them together in bundles. In this way the CPU can just blindly

schedule a bundle without having to do any heavy-duty thinking. Moving work

from run time to compile time is always a win.

Next, predication allows the statements in both branches of an if statement to

be merged together in a single stream, eliminating the conditional branch and thus

the prediction of which way it will go. Finally, speculative LOADs make it possible

to fetch operands in advance, without penalty if it turns out later that they are not

needed after all.
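
As a rough illustration of the predication idea, the Python sketch below contrasts a branching version of a small computation with a version in which both arms sit in one straight-line stream and each takes effect only when its predicate is true. The function names and the variables p and not_p are hypothetical, and the Python conditional expression merely stands in for the hardware's predicate gating; it is a model of the concept, not of any actual instruction encoding.

```python
def with_branch(a, b):
    # Conventional form: a conditional branch selects one of two arms.
    if a > b:
        return a - b
    else:
        return b - a


def predicated(a, b):
    # The comparison sets a predicate and its complement.
    p = a > b
    not_p = not p
    result = 0
    # Both arms appear in sequence; each commits its result only
    # when its own predicate holds, so no branch is needed.
    result = (a - b) if p else result
    result = (b - a) if not_p else result
    return result
```

Both functions compute the same value for any pair of inputs; the second simply has no control-flow decision for the hardware to predict.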

All in all, the Itanium architecture is an impressive design that appears to better

serve architects and users. So, are you running an Itanium processor in your

computer, are we running one in ours, is your mom running one, do you know someone

who is running one? Answer: no, no, no, and (probably) no. More than a decade

after its introduction, its adoption can be described politely as lackluster. But Intel

is still committed to producing Itanium-based systems, although they are limited to

high-end servers.

So let's bring it back to the original challenges that motivated the creation of

IA-64. Itanium was designed to solve the many deficiencies in the IA-32

architecture. Given that it was not widely adopted, how did Intel address these

deficiencies? As we will see in Chap. 8, the key to marching the IA-32 line forward was

not in retooling the ISA, but rather in embracing parallel computing, through chip

multiprocessor designs. For more information about the Itanium 2 and its

microarchitecture, see McNairy and Soltis (2003) and Rusu et al. (2004).
