alone. In either scenario, research questions arise about whether to make offline (e.g.,
compile-time) decisions about DVFS settings, versus online, reactive approaches.
(3) What is the hardware granularity at which voltage and frequency can be controlled?
This question is closely related to the question above. The bulk of the DVFS research has
focused on cases in which the entire processor core operates at the same (V, f) setting but
is asynchronous to the "outside world," such as main memory. In such scenarios, the main
goal of DVFS is to capitalize on cases in which the processor's workload is heavily memory-
bound. In these cases, the processor is often stalled waiting on memory, so reducing its supply
voltage and clock frequency will reduce power and energy without having significant impact on
performance.
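This tradeoff can be made concrete with a toy first-order model. The sketch below is illustrative only: the workload numbers, effective capacitance, and (V, f) operating points are invented, not drawn from any real processor.

```python
# Toy model of why DVFS pays off for memory-bound code: execution time is
# compute time (scales with 1/f) plus memory stall time (roughly fixed, set
# by DRAM latency), while dynamic power scales roughly as C * V^2 * f.
# All numbers below are made up for illustration.

def run_time_s(compute_cycles, f_hz, mem_stall_s):
    """Execution time: core cycles at frequency f plus fixed memory stalls."""
    return compute_cycles / f_hz + mem_stall_s

def dynamic_energy_j(c_eff, v, f_hz, t_s):
    """Dynamic energy = power * time, with power ~ C * V^2 * f."""
    return c_eff * v * v * f_hz * t_s

# Hypothetical high and low (V, f) points, and a heavily memory-bound phase.
hi_v, hi_f = 1.2, 2.0e9
lo_v, lo_f = 0.9, 1.0e9
compute_cycles = 1e8      # little actual compute...
mem_stall_s    = 0.40     # ...mostly waiting on DRAM

t_hi = run_time_s(compute_cycles, hi_f, mem_stall_s)
t_lo = run_time_s(compute_cycles, lo_f, mem_stall_s)
e_hi = dynamic_energy_j(1e-9, hi_v, hi_f, t_hi)
e_lo = dynamic_energy_j(1e-9, lo_v, lo_f, t_lo)

print(f"slowdown: {t_lo / t_hi:.2f}x, energy ratio: {e_lo / e_hi:.2f}")
```

With these invented numbers, halving the frequency slows the phase by only about 11% while cutting dynamic energy by roughly two-thirds, because most of the runtime was memory stall time that the core frequency does not affect.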
Other work has considered cases in which multiple clock domains may exist on a chip.
These so-called MCD scenarios might either be multiple clock domains within a single
processor core [199, 200, 216, 227, 228] or chip multiprocessors in which each on-chip
processor core has a different voltage/clock domain [67]. This dimension is explored in Section 3.4.
(4) How do the implementation characteristics of the DVFS approach being used affect the
strategies to employ? Some of the implementation characteristics for DVFS can have significant
influence on the strategies an architect might choose, and the likely payoffs they might offer.
For example, what is the delay required to engage a new setting of (V, f)? (And, can the
processor continue to execute during the transition from one (V, f) pair to another?) If the
delay is very short, then simple reactive techniques may offer high payoff. If the delay is quite
long, however, then techniques based on more intelligent or offline analysis might make more
sense.
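When transitions are cheap, a reactive policy can be as simple as the following sketch of a hypothetical interval-based governor. The frequency steps, stall-ratio thresholds, and input trace are all invented; a real policy of this kind would live in the OS or firmware.

```python
# Sketch of a simple reactive DVFS policy: each interval, inspect the fraction
# of cycles stalled on memory, and step frequency down when the core is mostly
# waiting, up when it is compute-bound. This only makes sense if the (V, f)
# transition delay is short relative to the sampling interval.

FREQ_STEPS_HZ = [1.0e9, 1.5e9, 2.0e9]  # assumed discrete operating points

def next_freq_idx(idx, mem_stall_ratio, lo=0.3, hi=0.6):
    """One reactive step; the lo/hi thresholds are tuning knobs, not standards."""
    if mem_stall_ratio > hi and idx > 0:
        return idx - 1                      # mostly stalled: slow down
    if mem_stall_ratio < lo and idx < len(FREQ_STEPS_HZ) - 1:
        return idx + 1                      # compute-bound: speed back up
    return idx                              # in between: hold

# Per-interval memory-stall ratios for an imaginary workload: a memory-bound
# phase (high stall ratios) followed by a compute-bound phase (low ratios).
idx = 2
for ratio in [0.7, 0.8, 0.2, 0.1]:
    idx = next_freq_idx(idx, ratio)
print(FREQ_STEPS_HZ[idx])
```

Note how the policy steps down twice during the memory-bound intervals and climbs back up once the workload becomes compute-bound, ending at the top frequency.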
(5) How does the DVFS landscape change when considering parallel applications on multiple-core
processors? When considering a single-threaded application in isolation, one need only
consider the possible asynchrony between compute and memory. In other regards, reducing
the clock frequency proportionately degrades the performance. In a parallel scenario, however,
reducing the clock frequency of one thread may impact other dependent threads that are waiting
for a result to be produced. Thus, when considering DVFS for parallel applications, some notion
of critical path analysis may be helpful.
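A minimal illustration of the critical-path point, assuming a barrier-synchronized pair of threads with invented per-thread work amounts:

```python
# Toy illustration of why per-core DVFS for parallel code needs critical-path
# awareness: with a barrier, completion time is set by the slowest thread, so
# slowing a thread with slack saves energy "for free," while slowing the
# critical thread slows the whole program. All numbers are invented.

def barrier_time_s(work_cycles, freqs_hz):
    """Time to reach the barrier = max over threads of (work / frequency)."""
    return max(w / f for w, f in zip(work_cycles, freqs_hz))

work = [2.0e9, 1.0e9]  # thread 0 has twice the work: it is on the critical path

t_fast  = barrier_time_s(work, [2.0e9, 2.0e9])  # both threads at full speed
t_slack = barrier_time_s(work, [2.0e9, 1.0e9])  # slow only the short thread
t_crit  = barrier_time_s(work, [1.0e9, 2.0e9])  # slow the critical thread

print(t_fast, t_slack, t_crit)
```

Halving the frequency of the non-critical thread leaves the barrier time unchanged (it merely consumes its slack), whereas halving the critical thread's frequency doubles the time for everyone.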
A related question is whether continuous settings of (V, f) pairs are possible,
or whether these values can only be changed in fixed, discrete steps. If only discrete, step-wise
adjustments of (V, f) are possible, then the optimization space becomes difficult to navigate
because it is “non-convex.” As a result, simple online techniques might have difficulty finding
global optima, and more complicated or offline analysis again becomes warranted.
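The discrete case can be framed as a small combinatorial search. The sketch below brute-forces per-phase (V, f) assignments to minimize dynamic energy under a deadline; the operating points, workload, and energy model are all invented for illustration.

```python
# With only a few discrete (V, f) operating points, choosing a setting per
# program phase under a deadline is a combinatorial problem rather than a
# smooth optimization, so here we simply enumerate every assignment.
from itertools import product

OP_POINTS = [(0.9, 1.0e9), (1.1, 1.5e9), (1.2, 2.0e9)]  # (V, f), invented

def phase_cost(cycles, v, f):
    """Time and dynamic energy (~ C * V^2 * f * t) for one phase."""
    t = cycles / f
    e = 1e-9 * v * v * f * t
    return t, e

def best_schedule(phase_cycles, deadline_s):
    """Brute-force the discrete space: one (V, f) choice per phase."""
    best = None
    for setting in product(OP_POINTS, repeat=len(phase_cycles)):
        t_total = e_total = 0.0
        for cyc, (v, f) in zip(phase_cycles, setting):
            dt, de = phase_cost(cyc, v, f)
            t_total += dt
            e_total += de
        if t_total <= deadline_s and (best is None or e_total < best[0]):
            best = (e_total, setting)
    return best

# Two phases of 1e9 and 2e9 cycles under an invented 2.1 s deadline.
e_best, setting_best = best_schedule([1.0e9, 2.0e9], deadline_s=2.1)
print(setting_best)
```

In this toy model the winning schedule runs both phases at the middle operating point rather than mixing the extremes, which a greedy per-phase heuristic would not necessarily find; real cost models (adding leakage, transition overheads, and so on) only make the space lumpier.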
Because DVFS is available for experimentation on real systems [111, 112, 2], and because
it offers such high leverage in power/energy savings, it has been widely studied in a variety of
communities. Our discussion only touches on some of the key observations from the architectural