Core    Issue Width    Execution
EV4     2              In-order
EV5     4              In-order
EV6     6              Out-of-order
EV8-    8              Out-of-order

Core    Area (mm^2)    Peak Power (Watts)    Typical Power (Watts)
EV4     2.87           4.97                  3.73
EV5     5.04           9.83                  6.88
EV6     24.5           17.80                 10.68
EV8-    236            92.88                 46.44
FIGURE 4.35: Estimated values for power and area for the cores used by Kumar et al. [147]. From
[147]. Copyright 2003 IEEE.
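To make the spread between the cores easier to see, the values in Figure 4.35 can be turned into a few ratios. The following is a minimal sketch, not part of the original study: the metric names (area_vs_ev4, peak_vs_ev4, peak_w_per_mm2, typical_over_peak) and the function derived_metrics are introduced here for illustration only. The numbers are transcribed directly from the figure.

```python
# Values transcribed from Figure 4.35: (area in mm^2, peak power in W, typical power in W).
CORES = {
    "EV4":  (2.87, 4.97, 3.73),
    "EV5":  (5.04, 9.83, 6.88),
    "EV6":  (24.5, 17.80, 10.68),
    "EV8-": (236.0, 92.88, 46.44),
}


def derived_metrics(cores):
    """Compute illustrative ratios per core; the metric names are hypothetical."""
    base_area, base_peak, _ = cores["EV4"]   # EV4 is used as the baseline core
    metrics = {}
    for name, (area, peak, typical) in cores.items():
        metrics[name] = {
            "area_vs_ev4": area / base_area,        # how many EV4 cores fit in this area
            "peak_vs_ev4": peak / base_peak,        # peak power relative to EV4
            "peak_w_per_mm2": peak / area,          # peak power density
            "typical_over_peak": typical / peak,    # fraction of peak drawn typically
        }
    return metrics


for core, m in derived_metrics(CORES).items():
    print(core, {k: round(v, 2) for k, v in m.items()})
```

With these figures, the EV8- occupies roughly 82 times the area of an EV4, which illustrates the area budget at stake when mixing simple and complex cores on one chip.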
Section 4.2 contains more details and techniques for handling the idle-unit activity involved in this
case.
Microarchitecture level: At a higher level, the microarchitecture level, Rochecouste, Pokam,
and Seznec proposed a work-steering approach for idle-width activity [192]. Their proposal is
a 4-issue, width-partitioned microarchitecture (WPM) processor composed of two clusters: a
normal, 64-bit-wide, 2-issue cluster and a narrow-width, 16-bit, 2-issue cluster. Instructions
are steered to the appropriate cluster according to the predicted width of their operands
and result. Because the microarchitecture is statically partitioned, it is complexity-effective,
consumes less power, and requires less area than other approaches for idle-width activity. The
drawback is that it is balanced only for workloads that have an even mix of narrow
and wide operands. If the operand-width characteristics of a workload differ significantly from
this mix, the performance of the width-partitioned microarchitecture could be compromised.
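The steering decision described above can be sketched in a few lines. This is an illustrative model only, not the mechanism of [192]: the last-width predictor (LastWidthPredictor), the helper significant_bits, and the rule "steer to the narrow cluster only when the widest predicted value fits in 16 bits" are assumptions made for this example. Only the 16-bit and 64-bit cluster widths come from the text, and recovery from width mispredictions is not modeled.

```python
NARROW_BITS = 16   # datapath width of the narrow cluster (from the text)
WIDE_BITS = 64     # datapath width of the normal cluster (from the text)


def significant_bits(value: int) -> int:
    """Bits needed to hold a 64-bit two's-complement value, sign bit included."""
    value &= (1 << WIDE_BITS) - 1
    if value >> (WIDE_BITS - 1):           # top bit set: reinterpret as negative
        value -= 1 << WIDE_BITS
    magnitude = value if value >= 0 else ~value
    return magnitude.bit_length() + 1


class LastWidthPredictor:
    """Hypothetical predictor: an instruction is assumed to repeat its last width."""

    def __init__(self) -> None:
        self._last = {}                    # pc -> widest operand/result width last seen

    def predict(self, pc: int) -> int:
        # Instructions never seen before are conservatively predicted wide.
        return self._last.get(pc, WIDE_BITS)

    def update(self, pc: int, operands, result: int) -> None:
        self._last[pc] = max(significant_bits(v) for v in (*operands, result))


def steer(predicted_bits: int) -> str:
    """Assumed rule: use the narrow cluster only if every value is predicted to fit."""
    return "narrow" if predicted_bits <= NARROW_BITS else "wide"
```

In this model, a workload whose instructions are almost all predicted wide (or almost all narrow) sends nearly everything to one 2-issue cluster, which is the imbalance the paragraph above warns about.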
Analogous approaches at this level could be devised for many of the techniques (and
the types of excess activity) discussed in this chapter by offering optimized and “unoptimized”
versions of the same structures side by side and dynamically choosing among them according
to program and run-time needs.
Processor core level: At this level, Kumar, Farkas, Jouppi, Ranganathan, and Tullsen proposed
a multi-core architecture using a variety of cores implementing the same Instruction-Set
Architecture (ISA) [147]. As the basis for their study they use the Alpha ISA and its
implementations, EV4, EV5, EV6 [134], and a single-threaded version of the EV8 (denoted EV8-), for
which there are plenty of published results for both power and performance.
Figure 4.35 shows the estimated values for the area, peak power, and typical power consumption
of the four cores examined. To derive these estimates, the authors use published results from
 