and neither option is clearly preferable. Immediately updating the entry in main
memory is referred to as write through. This approach is generally simpler to
implement and more reliable, since the memory is always up to date, which is
helpful, for example, if an error occurs and it is necessary to recover the state
of the memory. Unfortunately, it also usually requires more write traffic to
memory, so more sophisticated implementations tend to employ the alternative,
known as write deferred, or write back.
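The difference in write traffic can be made concrete with a minimal sketch. It assumes a hypothetical one-entry cache that already holds the target word, and it counts only memory writes; the names store_hit, evict, and memory_writes_for_stores are illustrative and do not come from any real design.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical one-entry cache model; every name here is illustrative. */
enum write_policy { WRITE_THROUGH, WRITE_BACK };

struct cache_entry {
    unsigned addr;   /* memory address of the cached word */
    int value;       /* cached copy of the word */
    bool dirty;      /* true if the cached copy is newer than memory */
};

/* Store to a word that is already cached (a write hit).
   Returns the number of memory writes issued immediately. */
static unsigned store_hit(struct cache_entry *e, int value, int *memory,
                          enum write_policy policy)
{
    assert(e != NULL && memory != NULL);
    e->value = value;
    if (policy == WRITE_THROUGH) {
        memory[e->addr] = value;   /* memory is updated right away */
        return 1;
    }
    e->dirty = true;               /* write back: memory stays stale for now */
    return 0;
}

/* Evict the entry; a dirty entry must be copied back to memory first. */
static unsigned evict(struct cache_entry *e, int *memory)
{
    assert(e != NULL && memory != NULL);
    if (!e->dirty)
        return 0;
    memory[e->addr] = e->value;
    e->dirty = false;
    return 1;
}

/* Count memory writes caused by n stores to one cached word,
   followed by eviction of that word. */
unsigned memory_writes_for_stores(enum write_policy policy, unsigned n)
{
    int memory[4] = {0};
    struct cache_entry e = { .addr = 2, .value = 0, .dirty = false };
    unsigned writes = 0;

    for (unsigned i = 0; i < n; i++)
        writes += store_hit(&e, (int)i + 1, memory, policy);
    writes += evict(&e, memory);

    /* Either way, memory holds the last stored value once the entry is gone. */
    assert(n == 0 || memory[e.addr] == (int)n);
    return writes;
}
```

Under these assumptions, ten stores to the same word cost ten memory writes with write through but only one with write back. The price is that memory is out of date until the eviction takes place.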
A related problem must be addressed for writes: what if a write occurs to a
location that is not currently cached? Should the data be brought into the cache, or
just written out to memory? Again, neither answer is always best. Most designs
that defer writes to memory tend to bring data into the cache on a write miss, a
technique known as write allocation. Most designs employing write through, on
the other hand, tend not to allocate an entry on a write because this option
complicates an otherwise simple design. Write allocation wins only if there are
repeated writes to the same or different words within a cache line.
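The effect of write allocation on a write miss can be sketched with a small simulation. It assumes a hypothetical direct-mapped cache of 4 lines with 8 words per line and a trace consisting only of writes. The sizes and the names replay_writes and struct stats are illustrative choices, not taken from any particular machine.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define WORDS_PER_LINE 8u   /* assumed line size, in words */
#define NUM_LINES      4u   /* assumed number of lines, direct mapped */

struct line  { bool valid; unsigned tag; };
struct stats { unsigned hits, misses, fills; };

/* Replay a trace of word addresses, every one of them a write.
   If allocate is true, a write miss brings the line into the cache
   (write allocation); otherwise the write goes only to memory and
   the line stays uncached. */
struct stats replay_writes(const unsigned *addrs, unsigned n, bool allocate)
{
    struct line cache[NUM_LINES] = {{0}};
    struct stats s = {0, 0, 0};

    assert(addrs != NULL || n == 0);
    for (unsigned i = 0; i < n; i++) {
        unsigned line_no = addrs[i] / WORDS_PER_LINE;  /* which memory line */
        unsigned index   = line_no % NUM_LINES;        /* which cache slot */
        unsigned tag     = line_no / NUM_LINES;        /* identifies the line */

        if (cache[index].valid && cache[index].tag == tag) {
            s.hits++;
        } else {
            s.misses++;
            if (allocate) {
                cache[index].valid = true;
                cache[index].tag = tag;
                s.fills++;                             /* one line fetched */
            }
        }
    }
    return s;
}
```

With four writes to words 64, 65, 66, and 64, which all fall in one line, allocation turns three of the four writes into hits at the cost of one line fill. With four writes to words 0, 8, 16, and 24, each in a different line, allocation performs four fills and gets no hits in return, which is the case where it does not pay off.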
Cache performance is critical to system performance because the gap between
CPU speed and memory speed is very large. Consequently, research on better
caching strategies is still a hot topic (Sanchez and Kozyrakis, 2011; Gaur et al.,
2011).
4.5.2 Branch Prediction
Modern computers are highly pipelined. The pipeline of Fig. 4-36 has seven
stages; high-end computers sometimes have 10-stage pipelines or even more.
Pipelining works best on linear code, so the fetch unit can just read in consecutive
words from memory and send them off to the decode unit in advance of their being
needed.
The only minor problem with this wonderful model is that it is not the slightest
bit realistic. Programs are not linear code sequences. They are full of branch
instructions. Consider the simple statements of Fig. 4-40(a). A variable, i, is
compared to 0 (probably the most common test in practice). Depending on the result,
another variable, k, gets assigned one of two possible values.
if (i == 0)                 CMP i,0     ; compare i to 0
    k = 1;                  BNE Else    ; branch to Else if not equal
else                Then:   MOV k,1     ; move 1 to k
    k = 2;                  BR Next     ; unconditional branch to Next
                    Else:   MOV k,2     ; move 2 to k
                    Next:

    (a)                     (b)
Figure 4-40. (a) A program fragment. (b) Its translation to a generic assembly
language.
 
 