THE MICROARCHITECTURE LEVEL - Structured Computer Organization

and neither option is clearly preferable. Immediately updating the entry in main memory is referred to as write through. This approach is generally simpler to implement and more reliable, since the memory is always up to date, which is helpful, for example, if an error occurs and it is necessary to recover the state of the memory. Unfortunately, it also usually requires more write traffic to memory, so more sophisticated implementations tend to employ the alternative, known as write deferred, or write back.

A related problem must be addressed for writes: what if a write occurs to a location that is not currently cached? Should the data be brought into the cache, or just written out to memory? Again, neither answer is always best. Most designs that defer writes to memory tend to bring data into the cache on a write miss, a technique known as write allocation. Most designs employing write through, on the other hand, tend not to allocate an entry on a write because this option complicates an otherwise simple design. Write allocation wins only if there are repeated writes to the same or different words within a cache line.
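The trade-off between the two write policies can be made concrete with a toy model. The following is a minimal sketch, not a description of any real cache: it assumes a tiny direct-mapped cache and counts memory traffic for a sequence of word writes under write through without write allocation and under write back with write allocation. All names and sizes (NUM_LINES, WORDS_PER_LINE, Policy, Traffic, simulate_writes) are hypothetical and chosen only for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NUM_LINES 4       /* hypothetical: lines in the toy direct-mapped cache */
#define WORDS_PER_LINE 8  /* hypothetical: words per cache line */

typedef enum { WRITE_THROUGH_NO_ALLOCATE, WRITE_BACK_ALLOCATE } Policy;

typedef struct {
    bool valid;
    bool dirty;    /* line differs from main memory (write back only) */
    unsigned tag;
} Line;

typedef struct {
    unsigned mem_writes;    /* write transactions sent to main memory:
                               single words for write through,
                               whole lines for write back */
    unsigned line_fetches;  /* lines read from memory on a write miss */
} Traffic;

/* Replay n word writes against an initially empty cache and count traffic. */
Traffic simulate_writes(Policy policy, const unsigned *word_addrs, size_t n)
{
    Line cache[NUM_LINES] = {0};
    Traffic t = {0, 0};

    assert(word_addrs != NULL || n == 0);

    for (size_t i = 0; i < n; i++) {
        unsigned line_no = word_addrs[i] / WORDS_PER_LINE;
        Line *line = &cache[line_no % NUM_LINES];
        unsigned tag = line_no / NUM_LINES;
        bool hit = line->valid && line->tag == tag;

        if (policy == WRITE_THROUGH_NO_ALLOCATE) {
            /* Hit or miss, the word goes to memory immediately.
               On a miss no cache entry is allocated. */
            t.mem_writes++;
            continue;
        }

        /* Write back with write allocation. */
        if (!hit) {
            if (line->valid && line->dirty)
                t.mem_writes++;      /* evict: write the old line back */
            t.line_fetches++;        /* bring the missing line into the cache */
            line->valid = true;
            line->tag = tag;
        }
        line->dirty = true;          /* memory is stale until the line is evicted */
    }

    /* Flush remaining dirty lines so both policies leave memory up to date. */
    if (policy == WRITE_BACK_ALLOCATE) {
        for (size_t i = 0; i < NUM_LINES; i++)
            if (cache[i].valid && cache[i].dirty)
                t.mem_writes++;
    }
    return t;
}
```

Under these assumptions, eight writes to the eight words of one line cost eight memory writes with write through, but only one line fetch and one line write-back with write back and write allocation. For writes that alternate between two lines mapping to the same cache entry, the allocating design fetches and evicts a line on every write, so it is no longer ahead.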

Cache performance is critical to system performance because the gap between CPU speed and memory speed is very large. Consequently, research on better caching strategies is still a hot topic (Sanchez and Kozyrakis, 2011, and Gaur et al., 2011).

4.5.2 Branch Prediction

Modern computers are highly pipelined. The pipeline of Fig. 4-36 has seven stages; high-end computers sometimes have 10-stage pipelines or even more. Pipelining works best on linear code, so the fetch unit can just read in consecutive words from memory and send them off to the decode unit in advance of their being needed.

The only minor problem with this wonderful model is that it is not the slightest bit realistic. Programs are not linear code sequences. They are full of branch instructions. Consider the simple statements of Fig. 4-40(a). A variable, i, is compared to 0 (probably the most common test in practice). Depending on the result, another variable, k, gets assigned one of two possible values.

if (i == 0)                  CMP i,0      ; compare i to 0
    k = 1;                   BNE Else     ; branch to Else if not equal
else                  Then:  MOV k,1      ; move 1 to k
    k = 2;                   BR Next      ; unconditional branch to Next
                      Else:  MOV k,2      ; move 2 to k
                      Next:

    (a)                      (b)

Figure 4-40. (a) A program fragment. (b) Its translation to a generic assembly language.
