will use standard synchronization libraries and will write synchronized programs, making the choice of a weak consistency model invisible to the programmer and yielding higher performance.
An alternative viewpoint, which we discuss more extensively in the next section, argues that
with speculation much of the performance advantage of relaxed consistency models can be
obtained with sequential or processor consistency.
A key part of this argument in favor of relaxed consistency revolves around the role of the compiler and its ability to optimize memory access to potentially shared variables; this topic is also discussed in Section 5.7.
5.7 Crosscutting Issues
Because multiprocessors redefine many system characteristics (e.g., performance assessment, memory latency, and the importance of scalability), they introduce interesting design problems that cut across the spectrum, affecting both hardware and software. In this section, we give several examples related to the issue of memory consistency. We then examine the performance gained when multithreading is added to multiprocessing.
Compiler Optimization and the Consistency Model
Another reason for defining a model for memory consistency is to specify the range of legal compiler optimizations that can be performed on shared data. In explicitly parallel programs, unless the synchronization points are clearly defined and the programs are synchronized, the compiler cannot interchange a read and a write of two different shared data items because such transformations might affect the semantics of the program. This prevents even relatively simple optimizations, such as register allocation of shared data, because such a process usually interchanges reads and writes. In implicitly parallelized programs, for example those written in High Performance FORTRAN (HPF), programs must be synchronized and the synchronization points are known, so this issue does not arise. Whether compilers can get significant advantage from more relaxed consistency models remains an open question, both from a research viewpoint and from a practical viewpoint, where the lack of uniform models is likely to retard progress on deploying compilers.
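To make this constraint concrete, the following sketch (purely illustrative; the variable names data and ready and the use of POSIX threads are assumptions, not taken from the text) shows two shared variables with no recognizable synchronization primitive. Because nothing tells the compiler that the spin loop on ready is a synchronization point, it cannot safely interchange the two writes in producer, move the read of data above the read of ready, or keep ready in a register across the loop; a compiler that does register-allocate ready may in fact turn the loop into an infinite one, which is exactly the hazard described above.

    #include <pthread.h>
    #include <stdio.h>

    int data;        /* shared payload                                   */
    int ready = 0;   /* shared flag, but no recognizable synchronization */

    void *producer(void *arg) {
        data  = 42;  /* write of shared data ...                         */
        ready = 1;   /* ... must not be interchanged with the flag write */
        return NULL;
    }

    void *consumer(void *arg) {
        while (ready == 0)            /* register-allocating `ready` (hoisting   */
            ;                         /* the load) would make this loop spin forever */
        printf("data = %d\n", data);  /* must not be moved above the read of ready   */
        return NULL;
    }

    int main(void) {
        pthread_t p, c;
        pthread_create(&c, NULL, consumer, NULL);
        pthread_create(&p, NULL, producer, NULL);
        pthread_join(p, NULL);
        pthread_join(c, NULL);
        return 0;
    }

Rewriting the flag with an explicit primitive (a lock or a C11 atomic) marks the synchronization points, and the compiler is then free to optimize the remaining accesses aggressively, which is the advantage of synchronized programs noted earlier.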
Using Speculation to Hide Latency in Strict Consistency Models
As we saw in Chapter 3, speculation can be used to hide memory latency. It can also be used to hide latency arising from a strict consistency model, giving much of the benefit of a relaxed memory model. The key idea is for the processor to use dynamic scheduling to reorder memory references, letting them possibly execute out of order. Executing the memory references out of order may generate violations of sequential consistency, which might affect the execution of the program. This possibility is avoided by using the delayed commit feature of a speculative processor. Assume the coherency protocol is based on invalidation. If the processor receives an invalidation for a memory reference before the memory reference is committed, the processor uses speculation recovery to back out of the computation and restart with the memory reference whose address was invalidated.
If the reordering of memory requests by the processor yields an execution order that could result in an outcome that differs from what would have been seen under sequential consistency, the processor backs out and redoes the execution, so only results that could have occurred under sequential consistency are ever committed.
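The following sketch (again illustrative only; the reorder-buffer layout, field names, and sizes are assumptions, not a description of any particular processor) captures the mechanism in miniature: loads may execute out of order, but they commit strictly in program order, and an invalidation that hits a still-uncommitted load forces the processor to squash that load and everything younger and replay from it.

    #include <stdbool.h>
    #include <stdint.h>
    #include <stdio.h>

    #define ROB_SIZE 4

    typedef struct {
        uint64_t addr;      /* address read by the speculative load         */
        bool     executed;  /* value obtained, possibly out of program order */
        bool     committed; /* retired in program order (delayed commit)     */
    } RobEntry;

    static RobEntry rob[ROB_SIZE];
    static int rob_head = 0;          /* oldest entry not yet committed */

    /* Delayed commit: retire entries strictly in program order. */
    static void commit_ready_entries(void) {
        while (rob_head < ROB_SIZE && rob[rob_head].executed) {
            rob[rob_head].committed = true;
            rob_head++;
        }
    }

    /* A coherence invalidation arrives from another processor. */
    static void on_invalidation(uint64_t addr) {
        for (int i = rob_head; i < ROB_SIZE; i++) {
            if (rob[i].executed && !rob[i].committed && rob[i].addr == addr) {
                /* The speculatively loaded value may be stale: back out of this
                   load and all younger work, then re-execute from this point.  */
                printf("squash and replay from entry %d (addr 0x%llx)\n",
                       i, (unsigned long long)addr);
                for (int j = i; j < ROB_SIZE; j++)
                    rob[j].executed = false;
                return;
            }
        }
        /* No uncommitted reference matches, so nothing needs to be undone. */
    }

    int main(void) {
        /* Program order: loads to A, B, C, D; B and D happen to execute early. */
        uint64_t addrs[ROB_SIZE] = {0xA0, 0xB0, 0xC0, 0xD0};
        for (int i = 0; i < ROB_SIZE; i++)
            rob[i].addr = addrs[i];
        rob[1].executed = true;      /* load B executed out of order */
        rob[3].executed = true;      /* load D executed out of order */

        on_invalidation(0xD0);       /* another core wrote D before D committed */
        rob[0].executed = true;      /* older loads finish ...                  */
        commit_ready_entries();      /* ... and commit in program order         */
        printf("committed through entry %d\n", rob_head - 1);
        return 0;
    }

Had the invalidation arrived after the load of D committed, no recovery would be needed: the committed result would already be one that sequential consistency allows, which is why only uncommitted references must be checked.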