Research has explored the benefits of per-core versus per-chip DVFS for CMPs. For example, on a four-core CMP in which DVFS was being employed to avoid thermal emergencies (rather than simply to save power), a per-core approach had 2.5× better throughput than a per-chip approach [67]. This is because the per-chip approach must scale down the entire chip's (V, f) when even a single core is nearing overheating. With per-core control, only the core with a hot spot must scale (V, f) downwards; other cores can maintain high speed unless they themselves encounter thermal problems.
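To make the contrast concrete, the C sketch below shows a per-core thermal control step; the temperature threshold, the (V, f) table, and the function name are hypothetical values chosen only for illustration, not a policy prescribed by the cited work.

    /* A minimal sketch of per-core thermal DVFS (hypothetical interface and
     * illustrative numbers): only a core whose temperature nears the limit
     * steps its (V, f) down; the others keep running at full speed. */
    #include <stdio.h>

    #define NUM_CORES  4
    #define TEMP_LIMIT 85.0                 /* assumed throttling threshold, deg C */

    /* Illustrative (V, f) operating points, highest performance first. */
    static const struct { double volts, ghz; } vf[] = {
        { 1.20, 3.0 }, { 1.10, 2.5 }, { 1.00, 2.0 }, { 0.90, 1.5 },
    };
    #define N_LEVELS (sizeof vf / sizeof vf[0])

    static unsigned level[NUM_CORES];       /* current operating point per core */

    void per_core_thermal_step(const double temp[NUM_CORES])
    {
        for (unsigned c = 0; c < NUM_CORES; c++) {
            if (temp[c] >= TEMP_LIMIT && level[c] + 1 < N_LEVELS)
                level[c]++;                 /* throttle only the hot core       */
            else if (temp[c] < TEMP_LIMIT - 5.0 && level[c] > 0)
                level[c]--;                 /* cooled down: restore performance */
            printf("core %u: %.2f V, %.1f GHz\n",
                   c, vf[level[c]].volts, vf[level[c]].ghz);
        }
    }

A per-chip policy would instead have to apply the hottest core's operating point to all four cores, which is exactly where the throughput loss reported above comes from.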
While a multi-core processor can be used to run independent programs for throughput, its promise for single-program performance lies in thread-level parallelism. Managing power in a multi-core processor when running parallel (multi-threaded) programs is currently a highly active area of research. Many research groups are tackling the problem, considering both symmetric architectures, which replicate the same core, and asymmetric architectures, which feature a variety of cores with different power/performance characteristics [146].5 Approaches include independent DVFS for each core [15], a mixture of chip-wide DVFS and core allocation [153], and work-steering strategies at the program level in heterogeneous architectures [146, 170].
3.5 HARDWARE-LEVEL DVFS
The main premise in much of the DVFS work is that a system, a task, or a program can be slowed down with disproportionately small impact on its performance (or the perception of performance for interactive tasks), while at the same time obtaining significant savings in power consumption by voltage scaling. This can only be achieved by intelligently reducing frequency to remove slack: idle time in the system, slack in tasks with deadlines, or instruction slack due to memory accesses in memory-bound program phases. A similar idea can be applied at the hardware level. Ernst et al. proposed a DVFS variation intended to remove slack in the timing of the hardware itself. Their approach is called Razor [73].
The driving motivation is to scale the supply voltage as low as possible for a given frequency while still maintaining correct operation. What prevents scaling the voltage below a critical level for a given frequency are the built-in margins of a process technology. For a given frequency, the allowed voltage level is set safely above the lowest voltage needed to tolerate the worst-case process and environmental variability in the design. In other words, the relation between voltage and frequency is chosen to guarantee correct operation with a significant margin over the worst-case scenario.
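To make this concrete, the following sketch shows how a Razor-style controller could reclaim that margin by letting the observed timing-error rate, rather than worst-case analysis, steer the supply voltage. The thresholds, step size, and function name are assumptions for illustration, not the published Razor controller.

    /* A minimal sketch of error-rate-driven voltage control in the spirit of
     * Razor: nudge Vdd downward while hardware-detected (and corrected) timing
     * errors stay rare, and back upward when they become too frequent.  All
     * constants are illustrative assumptions, not values from the Razor work. */

    #define V_MIN      0.60     /* volts, assumed lower bound                  */
    #define V_MAX      1.20     /* volts, nominal worst-case-margin level      */
    #define V_STEP     0.01     /* volts per control interval                  */
    #define TARGET_ERR 1e-4     /* target timing-error rate (errors per op)    */

    /* One control step: takes the current supply voltage and the error rate
     * observed over the last interval, returns the voltage for the next one. */
    double razor_like_step(double vdd, double error_rate)
    {
        if (error_rate > TARGET_ERR && vdd + V_STEP <= V_MAX)
            return vdd + V_STEP;            /* too many recoveries: add margin  */
        if (error_rate < TARGET_ERR / 10 && vdd - V_STEP >= V_MIN)
            return vdd - V_STEP;            /* comfortably rare: reclaim margin */
        return vdd;                         /* otherwise hold                   */
    }

The key point is that correctness is preserved not by margin but by detecting and correcting the occasional fault, which is what allows operation below the worst-case voltage.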
This, however, diminishes the value of DVFS since the useful voltage range for DVFS shrinks with each new process technology. Going below the critical voltage level (subcritical voltage) for a given frequency invites trouble: timing faults. Faults, however, are unlikely to
5 Similar to the architecture described in Section 3.4.2, but allowing multiple cores to be active at the same time.