Hardware Reference
In-Depth Information
Basic Hardware Primitives
The key ability we require to implement synchronization in a multiprocessor is a set of hard-
ware primitives with the ability to atomically read and modify a memory location. Without
such a capability, the cost of building basic synchronization primitives will be too high and
will increase as the processor count increases. There are a number of alternative formulations
of the basic hardware primitives, all of which provide the ability to atomically read and modi-
fy a location, together with some way to tell if the read and write were performed atomically.
These hardware primitives are the basic building blocks that are used to build a wide variety
of user-level synchronization operations, including things such as locks and barriers. In gener-
all architects do not expect users to employ the basic hardware primitives, but instead expect
that the primitives will be used by system programmers to build a synchronization library,
a process that is often complex and tricky. Let's start with one such hardware primitive and
show how it can be used to build some basic synchronization operations.
One typical operation for building synchronization operations is the atomic exchange , which
interchanges a value in a register for a value in memory. To see how to use this to build a basic
synchronization operation, assume that we want to build a simple lock where the value 0 is
used to indicate that the lock is free and 1 is used to indicate that the lock is unavailable. A pro-
cessor tries to set the lock by doing an exchange of 1, which is in a register, with the memory
address corresponding to the lock. The value returned from the exchange instruction is 1 if
some other processor had already claimed access and 0 otherwise. In the latter case, the value
is also changed to 1, preventing any competing exchange from also retrieving a 0.
For example, consider two processors that each try to do the exchange simultaneously: This
race is broken since exactly one of the processors will perform the exchange first, returning 0,
and the second processor will return 1 when it does the exchange. The key to using the ex-
change (or swap) primitive to implement synchronization is that the operation is atomic: The
exchange is indivisible, and two simultaneous exchanges will be ordered by the write serial-
ization mechanisms. It is impossible for two processors trying to set the synchronization vari-
able in this manner to both think they have simultaneously set the variable.
There are a number of other atomic primitives that can be used to implement synchroniz-
ation. They all have the key property that they read and update a memory value in such a
manner that we can tell whether or not the two operations executed atomically. One opera-
tion, present in many older multiprocessors, is test-and-set , which tests a value and sets it if the
value passes the test. For example, we could define an operation that tested for 0 and set the
value to 1, which can be used in a fashion similar to how we used atomic exchange. Another
atomic synchronization primitive is fetch-and-increment : It returns the value of a memory loc-
ation and atomically increments it. By using the value 0 to indicate that the synchronization
variable is unclaimed, we can use fetch-and-increment, just as we used exchange. There are
other uses of operations like fetch-and-increment, which we will see shortly.
Implementing a single atomic memory operation introduces some challenges, since it re-
quires both a memory read and a write in a single, uninterruptible instruction. This require-
ment complicates the implementation of coherence, since the hardware cannot allow any other
operations between the read and the write, and yet must not deadlock.
An alternative is to have a pair of instructions where the second instruction returns a value
from which it can be deduced whether the pair of instructions was executed as if the instruc-
tions were atomic. The pair of instructions is effectively atomic if it appears as if all other op-
erations executed by any processor occurred before or after the pair. Thus, when an instruc-
tion pair is effectively atomic, no other processor can change the value between the instruction
pair.
Search WWH ::




Custom Search