[Figure 4.15 labels: head pointer, tail pointer, limit pointer; regions: full, disabled, empty]
FIGURE 4.15: Circular IQ. Adapted from [ 80 ].
The instruction queue in Folegnani and Gonzalez's proposal is organized as a circular FIFO buffer. Head
and tail pointers point to the head entry (the oldest instruction) and the tail entry (the newest
instruction), respectively. The entries from the head through the tail form the full part, since they hold valid
instructions, either ready to issue or waiting for their operands (Figure 4.15). The remaining entries, from
the tail around to the head, form the empty part. As in the IQ discussed previously, CAM
fields in each entry match results returning from the functional units. When an instruction
matches both its operands, it becomes ready to issue.
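The organization described above can be sketched in a few lines of Python. This is only an illustrative software model, not the authors' hardware design: the names `Entry`, `CircularIQ`, `dispatch`, `wakeup`, and `issue` are hypothetical, each entry is assumed to have exactly two source operands, and the sketch keeps the tail index one slot past the newest instruction (a common software convention), whereas the text has the tail pointer point at the newest entry itself.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Entry:
    # Hypothetical entry layout: two source tags plus per-operand ready bits.
    valid: bool = False
    src_tags: List[int] = field(default_factory=lambda: [0, 0])
    src_ready: List[bool] = field(default_factory=lambda: [False, False])

    def ready_to_issue(self) -> bool:
        # An instruction can issue once both of its operands have matched.
        return self.valid and all(self.src_ready)


class CircularIQ:
    def __init__(self, size: int) -> None:
        self.entries = [Entry() for _ in range(size)]
        self.size = size
        self.head = 0   # index of the oldest instruction
        self.tail = 0   # index one past the newest instruction (assumption)
        self.count = 0  # slots in the full part, holes included

    def dispatch(self, src_tags, src_ready) -> Optional[int]:
        """Insert at the tail; return the slot index, or None if full."""
        if self.count == self.size:
            return None
        slot = self.tail
        self.entries[slot] = Entry(True, list(src_tags), list(src_ready))
        self.tail = (self.tail + 1) % self.size
        self.count += 1
        return slot

    def wakeup(self, result_tag: int) -> None:
        """Broadcast a result tag; waiting operands compare against it."""
        for e in self.entries:
            if not e.valid:
                continue
            for i in range(2):
                if not e.src_ready[i] and e.src_tags[i] == result_tag:
                    e.src_ready[i] = True

    def issue(self, slot: int) -> None:
        """Free an issued entry, then move head past any leading holes."""
        self.entries[slot].valid = False
        while self.count > 0 and not self.entries[self.head].valid:
            self.head = (self.head + 1) % self.size
            self.count -= 1


iq = CircularIQ(4)
a = iq.dispatch([5, 6], [True, False])   # waits only for tag 6
b = iq.dispatch([6, 7], [False, False])  # waits for tags 6 and 7
iq.wakeup(6)                             # a becomes ready, b still waits
```

Issuing `b` before `a` would leave a hole inside the full part, because the head can only advance once the oldest entry has been freed.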
Instruction Queue Collapsing: Upon issue of an instruction to the execution units, the
corresponding instruction queue entry is freed. This creates holes in the full part of the
IQ (see Figure 4.15). In some designs, such holes are filled by moving up all valid entries.
This is called collapsing and it is done because it can simplify the selection (scheduling) of
ready instructions. One example of this technique is the instruction queue of the Alpha
[ 134 ]. However, collapsing consumes power because of all the data movement it entails.
Folegnani and Gonzalez do not use it since holes in the full part are also included in their
power-saving schemes.
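To make the cost of collapsing concrete, the following minimal sketch counts how many entries must be physically rewritten when holes are squeezed out. It is a toy model under stated assumptions: the function name `collapse` is hypothetical, a hole is represented by `None`, and every entry that changes slot is counted as one move, which stands in for the data movement the text refers to.

```python
def collapse(entries):
    """Compact valid entries toward the head.

    Returns (new_entries, moves), where moves is the number of entries
    that had to be rewritten into a different slot.
    """
    compacted = []
    moves = 0
    for pos, e in enumerate(entries):
        if e is None:            # hole left by an issued instruction
            continue
        if pos != len(compacted):
            moves += 1           # entry lands in a new slot: one data move
        compacted.append(e)
    holes = len(entries) - len(compacted)
    return compacted + [None] * holes, moves


# Full part of a toy queue with two holes (issued instructions).
queue = ["i0", None, "i2", "i3", None, "i5"]
compacted, moves = collapse(queue)
```

In this toy case a single early hole forces every younger entry behind it to move, which illustrates why a non-collapsing design that simply tolerates holes avoids that activity.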
The key observation Folegnani and Gonzalez make about such an instruction queue is
that empty entries need not participate in the tag match at all. Furthermore, ready operands
also do not need to participate in the tag match. It is fairly straightforward to disable an entry's
CAM tag comparison by gating the tagline precharge transistor with the entry's ready flag or
valid flag. This immediately reduces the comparison activity, making it proportional to the
number of valid waiting entries in the IQ. According to their statistics for a 128-entry IQ and
for representative SPEC2000 benchmarks, on average, there are only 58 entries in the full area of
the IQ and 26 of those are already empty. This means that about 89% of the wake-up energy
(CAM matching) can be saved. Similarly to the estimates of Buyuktosunoglu et al., Folegnani
and Gonzalez also attribute the bulk (63%) of the IQ power to the associative matching.
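The effect of gating can be illustrated by counting how many operand comparators fire on one result broadcast. The sketch below is a toy model, not a reproduction of the authors' measurements: the function `active_comparators` and the 8-entry queue are hypothetical, each entry is assumed to have two operand comparators, and a comparator is assumed to be enabled only when its entry is valid and its operand is not yet ready.

```python
def active_comparators(entries, gated):
    """Count comparators that fire on a single tag broadcast.

    entries: list of (valid, [ready0, ready1]) tuples, one per IQ slot.
    gated:   if False, every comparator in the queue fires;
             if True, only unready operands of valid entries compare.
    """
    if not gated:
        return 2 * len(entries)
    return sum(
        (not r) for valid, ready in entries if valid for r in ready
    )


# Toy 8-entry queue: three valid entries, one hole, four empty slots.
toy_iq = [
    (True, [True, False]),    # waiting on one operand
    (True, [False, False]),   # waiting on both operands
    (False, [False, False]),  # hole: already issued
    (True, [True, True]),     # ready to issue, nothing to compare
    (False, [False, False]),  # empty
    (False, [False, False]),  # empty
    (False, [False, False]),  # empty
    (False, [False, False]),  # empty
]

ungated = active_comparators(toy_iq, gated=False)
gated = active_comparators(toy_iq, gated=True)
```

With these made-up contents, 16 comparators fire without gating and only 3 with it. The numbers are chosen purely for illustration and are unrelated to the SPEC2000 figures quoted above.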
 