if (present_IPC < D_factor * last_IPC)
    increase size;
else if (number of active segments < threshold_1)
    decrease size;
else if (number of active segments < threshold_2)
    retain current size;
else
    increase size;
FIGURE 4.14: Algorithm to resize an Issue Queue. Adapted from [42].
Not surprisingly, the power of the CAM part is linear in the number of enabled entries.
If a 32-entry IQ is divided into four 8-entry chunks, disabling three of the four chunks yields
power savings of 75%. The CAM part consumes ten times more energy than the SRAM part,
so reducing the CAM energy provides the bulk of the benefit.
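The arithmetic above can be illustrated with a short Python sketch. The function names are hypothetical. The overall estimate assumes that the SRAM part keeps drawing full power when chunks are disabled, which the text does not state, so it should be read as a pessimistic bound rather than a figure from [42].

```python
def cam_power_savings(enabled_chunks: int, total_chunks: int) -> float:
    """Fraction of CAM power saved, assuming CAM power is linear
    in the number of enabled entries (equal-sized chunks)."""
    if total_chunks <= 0 or not 0 <= enabled_chunks <= total_chunks:
        raise ValueError("need 0 <= enabled_chunks <= total_chunks, total_chunks > 0")
    return 1.0 - enabled_chunks / total_chunks


def iq_power_savings(enabled_chunks: int, total_chunks: int,
                     cam_to_sram_ratio: float = 10.0) -> float:
    """Estimated savings for the whole IQ (CAM + SRAM).

    Assumption (not from the source): the SRAM part is unaffected by
    resizing, and the CAM part uses cam_to_sram_ratio times its energy.
    """
    cam_share = cam_to_sram_ratio / (cam_to_sram_ratio + 1.0)
    return cam_power_savings(enabled_chunks, total_chunks) * cam_share


if __name__ == "__main__":
    # 32-entry IQ split into four 8-entry chunks, three chunks disabled.
    print(cam_power_savings(enabled_chunks=1, total_chunks=4))
    print(iq_power_savings(enabled_chunks=1, total_chunks=4))
```

Under these assumptions, the CAM savings for one enabled chunk out of four are 75%, and the whole-IQ savings are somewhat lower because the SRAM share is left untouched.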
4.6.2 Readiness Feedback Control
Perhaps more interesting than the design of the partitionable IQ is the algorithm to control
its size. The idea is to adjust the IQ size based on the “activity” of its entries. Although the
authors do not discuss it in detail, the high-level scheme they propose bases its decisions on the
average number of active IQ chunks within a time window.
An IQ chunk is regarded as active if at least half of its entries have their ready flag set;
i.e., an active chunk has a significant percentage of its entries ready to issue. On every cycle,
the number of active chunks is accumulated in a register. At the end of a timing window this
register is compared to two empirically chosen thresholds and a decision is taken on whether
to disable chunks, leave the size as is, or enable more. 5
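The activity measurement can be sketched in Python as follows. This is only an illustration of the description above: the class and function names are hypothetical, the ready flags are modeled as a flat list of booleans covering the enabled entries, and reporting the per-cycle average at the end of the window is an assumption, since the authors do not give the details.

```python
from typing import Sequence


def chunk_is_active(ready_flags: Sequence[bool]) -> bool:
    """A chunk is active if at least half of its entries are ready."""
    return len(ready_flags) > 0 and 2 * sum(ready_flags) >= len(ready_flags)


def count_active_chunks(ready_flags: Sequence[bool], chunk_size: int) -> int:
    """Split the IQ ready flags into chunks and count the active ones."""
    chunks = [ready_flags[i:i + chunk_size]
              for i in range(0, len(ready_flags), chunk_size)]
    return sum(1 for chunk in chunks if chunk_is_active(chunk))


class ActivityCounter:
    """Accumulates the number of active chunks over a timing window."""

    def __init__(self, chunk_size: int) -> None:
        self.chunk_size = chunk_size
        self.accumulated = 0  # models the accumulation register
        self.cycles = 0

    def tick(self, ready_flags: Sequence[bool]) -> None:
        """Call once per cycle with the current ready flags."""
        self.accumulated += count_active_chunks(ready_flags, self.chunk_size)
        self.cycles += 1

    def end_window(self) -> float:
        """Return the average number of active chunks and reset."""
        average = self.accumulated / self.cycles if self.cycles else 0.0
        self.accumulated = 0
        self.cycles = 0
        return average
```

The value returned by `end_window` is what a controller would compare against the two thresholds.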
This decision-making scheme is wrapped with a safety mechanism that reverses the last
(downsizing) decision if it had a negative effect on IPC. The threshold that triggers the safety
mechanism is given as a degradation factor D on IPC: if the new IPC is less than D times the old
IPC (where D is less than 1), then the last sizing decision is reversed. The full decision scheme in
pseudocode is shown in Figure 4.14.
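For readers who prefer executable code, the decision scheme of Figure 4.14 might be written in Python roughly as below. The function name, the one-chunk step, the clamping to a minimum and maximum size, and all numeric values in the usage example are assumptions made for illustration; [42] does not specify them.

```python
def next_iq_size(current_chunks: int, max_chunks: int,
                 present_ipc: float, last_ipc: float,
                 active_chunks: float,
                 d_factor: float, threshold_1: float, threshold_2: float,
                 min_chunks: int = 1) -> int:
    """Return the number of enabled IQ chunks for the next window.

    Mirrors the branch order of Figure 4.14. Resizing by exactly one
    chunk per window is an assumption of this sketch.
    """
    if present_ipc < d_factor * last_ipc:
        step = 1    # safety mechanism: IPC degraded too much, grow back
    elif active_chunks < threshold_1:
        step = -1   # low activity: disable a chunk
    elif active_chunks < threshold_2:
        step = 0    # moderate activity: retain current size
    else:
        step = 1    # high activity: enable a chunk
    # Keep the result within the physically available range.
    return max(min_chunks, min(max_chunks, current_chunks + step))


if __name__ == "__main__":
    # Hypothetical values: D = 0.9, thresholds of 0.5 and 1.5 active chunks.
    size = next_iq_size(current_chunks=2, max_chunks=4,
                        present_ipc=0.8, last_ipc=1.0,
                        active_chunks=0.1,
                        d_factor=0.9, threshold_1=0.5, threshold_2=1.5)
    print(size)  # the IPC check fires first, so the queue grows
```

Note that the IPC check takes priority over the activity thresholds, so a window with low activity still grows the queue if the last downsizing hurt performance.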
With this scheme applied to a simulated 4-issue processor with a 32-entry issue queue, running
some of the integer SPEC2000 benchmarks, the power savings for the IQ are 35% (on average)
with an IPC degradation of just over 4% [42].
4.6.3 Occupancy Feedback Control
Ponomarev, Kucuk, and Ghose examine the more general problem of reducing power for
the three main structures that collectively comprise the instruction scheduling mechanism:
5 The authors imply that the thresholds change according to the number of enabled chunks, but no further detail is
given [42].