FIGURE 5.6 A write-invalidate cache coherence protocol for a private write-back cache,
showing the states and state transitions for each block in the cache. The cache states
are shown in circles, with any access permitted by the local processor without a state
transition shown in parentheses under the name of the state. The stimulus causing a state change
is shown on the transition arcs in regular type, and any bus actions generated as part of the
state transition are shown on the transition arc in bold. The stimulus actions apply to a block in
the private cache, not to a specific address in the cache. Hence, a read miss to a block in the
shared state is a miss for that cache block but for a different address. The left side of the
diagram shows state transitions based on actions of the processor associated with this cache;
the right side shows transitions based on operations on the bus. A read miss in the exclusive
or shared state and a write miss in the exclusive state occur when the address requested by
the processor does not match the address in the local cache block. Such a miss is a standard
cache replacement miss. An attempt to write a block in the shared state generates an
invalidate. Whenever a bus transaction occurs, all private caches that contain the cache block
specified in the bus transaction take the action dictated by the right half of the diagram. The
protocol assumes that memory (or a shared cache) provides data on a read miss for a block that
is clean in all local caches. In actual implementations, these two sets of state diagrams are
combined. In practice, there are many subtle variations on invalidate protocols, including the
introduction of an exclusive unmodified state and differences in whether a processor or memory
provides data on a miss. In a multicore chip, the shared cache (usually L3, but sometimes L2) acts as
the equivalent of memory, and the bus is the bus between the private caches of each core
and the shared cache, which in turn interfaces to the memory.
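The two halves of the diagram can be sketched as a pair of lookup tables, one keyed by processor requests and one by snooped bus transactions. This is a minimal illustration built from the caption's description, not a definitive controller: the event names (`read_miss`, `write_hit`, `write_back`, and so on), the table layout, and the helper functions are all hypothetical, and the write-back actions on the exclusive-state arcs are an assumption that follows from the cache being write-back.

```python
from enum import Enum


class State(Enum):
    INVALID = "invalid"
    SHARED = "shared"        # read only, clean
    EXCLUSIVE = "exclusive"  # read/write, possibly dirty


# Left half of the diagram (assumed event names):
# (current state, processor request) -> (next state, bus actions generated)
PROCESSOR_SIDE = {
    (State.INVALID, "read_miss"): (State.SHARED, ["read_miss"]),
    (State.INVALID, "write_miss"): (State.EXCLUSIVE, ["write_miss"]),
    (State.SHARED, "read_hit"): (State.SHARED, []),
    # Replacement miss: the block is clean, so nothing is written back.
    (State.SHARED, "read_miss"): (State.SHARED, ["read_miss"]),
    # Writing a shared block generates an invalidate.
    (State.SHARED, "write_hit"): (State.EXCLUSIVE, ["invalidate"]),
    (State.SHARED, "write_miss"): (State.EXCLUSIVE, ["write_miss"]),
    (State.EXCLUSIVE, "read_hit"): (State.EXCLUSIVE, []),
    (State.EXCLUSIVE, "write_hit"): (State.EXCLUSIVE, []),
    # Replacement misses on a dirty block write the old block back first.
    (State.EXCLUSIVE, "read_miss"): (State.SHARED, ["write_back", "read_miss"]),
    (State.EXCLUSIVE, "write_miss"): (State.EXCLUSIVE, ["write_back", "write_miss"]),
}

# Right half of the diagram:
# (current state, bus transaction for this block) -> (next state, actions)
BUS_SIDE = {
    (State.SHARED, "read_miss"): (State.SHARED, []),
    (State.SHARED, "write_miss"): (State.INVALID, []),
    (State.SHARED, "invalidate"): (State.INVALID, []),
    # The only up-to-date copy is here, so this cache supplies it.
    (State.EXCLUSIVE, "read_miss"): (State.SHARED, ["write_back"]),
    (State.EXCLUSIVE, "write_miss"): (State.INVALID, ["write_back"]),
}


def processor_request(state, request):
    """Apply a request from the local processor to one cache block."""
    return PROCESSOR_SIDE[(state, request)]


def bus_event(state, event):
    """Apply a snooped bus transaction; unlisted pairs leave the block as is."""
    return BUS_SIDE.get((state, event), (state, []))


# Two private caches, A and B, each tracking the same block.
a, _ = processor_request(State.INVALID, "read_miss")          # A reads the block
b, bus_actions = processor_request(State.INVALID, "write_miss")  # B writes it
for action in bus_actions:
    a, _ = bus_event(a, action)                               # A snoops B's miss
```

After the last loop, cache A has dropped its copy and cache B holds the block exclusively, which is the write-invalidate behavior the caption describes.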
All of the states in this cache protocol would be needed in a uniprocessor cache, where
they would correspond to the invalid, valid (and clean), and dirty states. Most of the state
changes indicated by arcs in the left half of Figure 5.6 would be needed in a write-back
uniprocessor cache, with the exception being the invalidate on a write hit to a shared block. The state
changes represented by the arcs in the right half of Figure 5.6 are needed only for coherence
and would not appear at all in a uniprocessor cache controller.
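For comparison, the uniprocessor case can be sketched as a single table over the invalid, clean, and dirty states. This is an illustrative sketch only: the event and action names are hypothetical, and a write-allocate policy is assumed, which the text does not specify. Note that a write hit to a clean block simply marks it dirty, with no invalidate, and that there is no bus-side table at all.

```python
# (current state, processor request) -> (next state, memory actions)
# Assumes a write-back, write-allocate uniprocessor cache (an assumption).
UNIPROCESSOR = {
    ("invalid", "read_miss"): ("clean", ["fetch"]),
    ("invalid", "write_miss"): ("dirty", ["fetch"]),
    ("clean", "read_hit"): ("clean", []),
    ("clean", "read_miss"): ("clean", ["fetch"]),
    # No other cache can hold a copy, so there is nothing to invalidate.
    ("clean", "write_hit"): ("dirty", []),
    ("clean", "write_miss"): ("dirty", ["fetch"]),
    ("dirty", "read_hit"): ("dirty", []),
    ("dirty", "write_hit"): ("dirty", []),
    # Replacing a dirty block writes it back before fetching the new one.
    ("dirty", "read_miss"): ("clean", ["write_back", "fetch"]),
    ("dirty", "write_miss"): ("dirty", ["write_back", "fetch"]),
}


def access(state, request):
    """Apply one processor request to a block in a uniprocessor cache."""
    return UNIPROCESSOR[(state, request)]


# A block is read, then written, then replaced by a read to another address.
state, actions = access("invalid", "read_miss")
state, actions = access(state, "write_hit")
state, actions = access(state, "read_miss")
```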
As mentioned earlier, there is only one finite-state machine per cache, with stimuli coming
either from the attached processor or from the bus. Figure 5.7 shows how the state transitions
 