will use standard synchronization libraries and will write synchronized programs, making the choice of a weak consistency model invisible to the programmer and yielding higher performance.
An alternative viewpoint, which we discuss more extensively in the next section, argues that
with speculation much of the performance advantage of relaxed consistency models can be
obtained with sequential or processor consistency.
A key part of this argument in favor of relaxed consistency revolves around the role of the compiler and its ability to optimize memory access to potentially shared variables; this topic is also discussed in Section 5.7.
5.7 Crosscutting Issues
Because multiprocessors redefine many system characteristics (e.g., performance assessment, memory latency, and the importance of scalability), they introduce interesting design problems that cut across the spectrum, affecting both hardware and software. In this section, we give several examples related to the issue of memory consistency. We then examine the performance gained when multithreading is added to multiprocessing.
Compiler Optimization and the Consistency Model
Another reason for defining a model for memory consistency is to specify the range of legal compiler optimizations that can be performed on shared data. In explicitly parallel programs, unless the synchronization points are clearly defined and the programs are synchronized, the compiler cannot interchange a read and a write of two different shared data items because such transformations might affect the semantics of the program. This prevents even relatively simple optimizations, such as register allocation of shared data, because such a process usually interchanges reads and writes. In implicitly parallelized programs, for example those written in High Performance FORTRAN (HPF), programs must be synchronized and the synchronization points are known, so this issue does not arise. Whether compilers can get significant advantage from more relaxed consistency models remains an open question, both from a research viewpoint and from a practical viewpoint, where the lack of uniform models is likely to retard progress on deploying compilers.
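To make this constraint concrete, the following sketch (purely illustrative; the variable names data and ready and the use of POSIX threads are assumptions, not taken from the text) shows two shared variables with no recognizable synchronization primitive. Because nothing tells the compiler that the spin loop on ready is a synchronization point, it cannot safely interchange the two writes in producer, move the read of data above the read of ready, or keep ready in a register across the loop; a compiler that does register-allocate ready may in fact turn the loop into an infinite one, which is exactly the hazard described above.

    #include <pthread.h>
    #include <stdio.h>

    int data;        /* shared payload                                   */
    int ready = 0;   /* shared flag, but no recognizable synchronization */

    void *producer(void *arg) {
        data  = 42;  /* write of shared data ...                         */
        ready = 1;   /* ... must not be interchanged with the flag write */
        return NULL;
    }

    void *consumer(void *arg) {
        while (ready == 0)            /* register-allocating `ready` (hoisting   */
            ;                         /* the load) would make this loop spin forever */
        printf("data = %d\n", data);  /* must not be moved above the read of ready   */
        return NULL;
    }

    int main(void) {
        pthread_t p, c;
        pthread_create(&c, NULL, consumer, NULL);
        pthread_create(&p, NULL, producer, NULL);
        pthread_join(p, NULL);
        pthread_join(c, NULL);
        return 0;
    }

Rewriting the flag with an explicit primitive (a lock or a C11 atomic) marks the synchronization points, and the compiler is then free to optimize the remaining accesses aggressively, which is the advantage of synchronized programs noted earlier.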
Using Speculation to Hide Latency in Strict Consistency Models
As we saw in Chapter 3, speculation can be used to hide memory latency. It can also be used to hide latency arising from a strict consistency model, giving much of the benefit of a relaxed memory model. The key idea is for the processor to use dynamic scheduling to reorder memory references, letting them possibly execute out of order. Executing the memory references out of order may generate violations of sequential consistency, which might affect the execution of the program. This possibility is avoided by using the delayed commit feature of a speculative processor. Assume the coherency protocol is based on invalidation. If the processor receives an invalidation for a memory reference before the memory reference is committed, the processor uses speculation recovery to back out of the computation and restart with the memory reference whose address was invalidated.
If the reordering of memory requests by the processor yields an execution order that could result in an outcome that differs from what would have been seen under sequential consistency, the processor backs out and redoes the execution, so only results that could have occurred under sequential consistency are ever committed.
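The following sketch (again illustrative only; the reorder-buffer layout, field names, and sizes are assumptions, not a description of any particular processor) captures the mechanism in miniature: loads may execute out of order, but they commit strictly in program order, and an invalidation that hits a still-uncommitted load forces the processor to squash that load and everything younger and replay from it.

    #include <stdbool.h>
    #include <stdint.h>
    #include <stdio.h>

    #define ROB_SIZE 4

    typedef struct {
        uint64_t addr;      /* address read by the speculative load         */
        bool     executed;  /* value obtained, possibly out of program order */
        bool     committed; /* retired in program order (delayed commit)     */
    } RobEntry;

    static RobEntry rob[ROB_SIZE];
    static int rob_head = 0;          /* oldest entry not yet committed */

    /* Delayed commit: retire entries strictly in program order. */
    static void commit_ready_entries(void) {
        while (rob_head < ROB_SIZE && rob[rob_head].executed) {
            rob[rob_head].committed = true;
            rob_head++;
        }
    }

    /* A coherence invalidation arrives from another processor. */
    static void on_invalidation(uint64_t addr) {
        for (int i = rob_head; i < ROB_SIZE; i++) {
            if (rob[i].executed && !rob[i].committed && rob[i].addr == addr) {
                /* The speculatively loaded value may be stale: back out of this
                   load and all younger work, then re-execute from this point.  */
                printf("squash and replay from entry %d (addr 0x%llx)\n",
                       i, (unsigned long long)addr);
                for (int j = i; j < ROB_SIZE; j++)
                    rob[j].executed = false;
                return;
            }
        }
        /* No uncommitted reference matches, so nothing needs to be undone. */
    }

    int main(void) {
        /* Program order: loads to A, B, C, D; B and D happen to execute early. */
        uint64_t addrs[ROB_SIZE] = {0xA0, 0xB0, 0xC0, 0xD0};
        for (int i = 0; i < ROB_SIZE; i++)
            rob[i].addr = addrs[i];
        rob[1].executed = true;      /* load B executed out of order */
        rob[3].executed = true;      /* load D executed out of order */

        on_invalidation(0xD0);       /* another core wrote D before D committed */
        rob[0].executed = true;      /* older loads finish ...                  */
        commit_ready_entries();      /* ... and commit in program order         */
        printf("committed through entry %d\n", rob_head - 1);
        return 0;
    }

Had the invalidation arrived after the load of D committed, no recovery would be needed: the committed result would already be one that sequential consistency allows, which is why only uncommitted references must be checked.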