Effectiveness of Simultaneous Multithreading on Superscalar Processors
A key question is, How much performance can be gained by implementing SMT? When this
question was explored in 2000-2001, researchers assumed that dynamic superscalars would
get much wider in the next five years, supporting six to eight issues per clock with speculative
dynamic scheduling, many simultaneous loads and stores, large primary caches, and four to
eight contexts with simultaneous issue and retirement from multiple contexts. No processor
has gotten close to this level.
As a result, simulation research results that showed gains for multiprogrammed workloads
of two or more times are unrealistic. In practice, the existing implementations of SMT offer
only two to four contexts with fetching and issue from only one, and up to four issues per
clock. The result is that the gain from SMT is also more modest.
For example, in the Pentium 4 Extreme, as implemented in HP-Compaq servers, the use of
SMT yields a performance improvement of 1.01 when running the SPECintRate benchmark
and about 1.07 when running the SPECfpRate benchmark. Tuck and Tullsen [2003] reported
that, on the SPLASH parallel benchmarks, they found single-core multithreaded speedups
ranging from 1.02 to 1.67, with an average speedup of about 1.22.
With the availability of recent extensive and insightful measurements done by
Esmaeilzadeh et al. [2011], we can look at the performance and energy benefits of using SMT
in a single i7 core using a set of multithreaded applications. The benchmarks we use consist
of a collection of parallel scientific applications and a set of multithreaded Java programs
from the DaCapo and SPEC Java suite, as summarized in Figure 3.34. The Intel i7 supports SMT
with two threads. Figure 3.35 shows the performance ratio and the energy efficiency ratio of
these benchmarks run on one core of the i7 with SMT turned off and on. (We plot the energy
efficiency ratio, which is the inverse of energy consumption, so that, like speedup, a higher
ratio is better.)
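The two ratios described above can be sketched in a few lines of Python. This is only an illustration of the definitions: the function name smt_ratios and the sample times and energies are hypothetical and are not measurements from Esmaeilzadeh et al. or from Figure 3.35.

```python
def smt_ratios(time_off, time_on, energy_off, energy_on):
    """Return (speedup, energy_efficiency_ratio) for SMT on versus SMT off.

    speedup                 = execution time without SMT / time with SMT
    energy efficiency ratio = energy without SMT / energy with SMT
    Energy efficiency is the inverse of energy consumption, so for both
    ratios a value above 1.0 means that turning SMT on helped.
    """
    speedup = time_off / time_on
    energy_efficiency_ratio = energy_off / energy_on
    return speedup, energy_efficiency_ratio


# Hypothetical numbers, chosen only to illustrate the arithmetic.
speedup, efficiency = smt_ratios(
    time_off=10.0,     # seconds with SMT off (assumed)
    time_on=8.0,       # seconds with SMT on (assumed)
    energy_off=100.0,  # joules with SMT off (assumed)
    energy_on=90.0,    # joules with SMT on (assumed)
)
print(f"speedup = {speedup:.2f}, energy efficiency ratio = {efficiency:.2f}")
```

In this made-up case SMT shortens the run and also lowers total energy, so both ratios exceed 1.0. A benchmark that ran faster with SMT but consumed more energy would show a speedup above 1.0 and an energy efficiency ratio below 1.0.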