Comparison of a GPU and a MIMD with Multimedia SIMD
A group of Intel researchers published a paper [Lee et al. 2010] comparing a quad-core Intel i7
(see Chapter 3) with multimedia SIMD extensions to the previous-generation GPU, the Tesla
GTX 280. Figure 4.27 lists the characteristics of the two systems. Both products were purchased
in Fall 2009. The Core i7 is built in Intel's 45-nanometer semiconductor technology, while the GPU
is built in TSMC's 65-nanometer technology. Although a comparison by a neutral party, or by both
interested parties, might have been fairer, the purpose of this section is not to
determine how much faster one product is than the other, but to understand the relative
value of the features of these two contrasting architecture styles.
FIGURE 4.27 Intel Core i7-960, NVIDIA GTX 280, and GTX 480 specifications. The rightmost
columns show the ratios of GTX 280 and GTX 480 to Core i7. For single-precision SIMD
FLOPS on the GTX 280, the higher speed (933) comes from a very rare case of dual issuing
of fused multiply-add and multiply. More reasonable is 622 for single fused multiply-adds.
Although the case study is between the 280 and i7, we include the 480 to show its relationship
to the 280 since it is described in this chapter. Note that these memory bandwidths are higher
than in Figure 4.28 because these are DRAM pin bandwidths and those in Figure 4.28 are at
the processors as measured by a benchmark program. (From Table 2 in Lee et al. [2010].)
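The arithmetic behind the caption's two single-precision peaks, and behind the ridge points
used when reading the rooflines of Figure 4.28, can be sketched in a few lines of Python. This
is only an illustrative sketch: the lane count and shader clock for the GTX 280 (240 lanes at
1.296 GHz) and the double-precision peaks and measured bandwidths fed to the roofline helper
(78 GFLOP/sec and 127 GB/sec for the GTX 280, 42.66 GFLOP/sec and 16.4 GB/sec for the Core i7 920)
are assumed inputs, not values quoted in the text above, and should be checked against
Figures 4.27 and 4.28. The function names are hypothetical.

```python
# Back-of-the-envelope peak-throughput and roofline arithmetic.
# Every hardware number below is an ASSUMED input for illustration;
# verify against Figures 4.27 and 4.28 before relying on it.


def peak_gflops(lanes: int, clock_ghz: float, flops_per_lane_per_cycle: int) -> float:
    """Peak GFLOP/sec = lanes x clock (GHz) x FLOPs issued per lane per cycle."""
    return lanes * clock_ghz * flops_per_lane_per_cycle


def ridge_point(peak: float, bandwidth_gb_s: float) -> float:
    """Arithmetic intensity (FLOPs/byte) where the roofline turns flat."""
    return peak / bandwidth_gb_s


def attainable_gflops(peak: float, bandwidth_gb_s: float, intensity: float) -> float:
    """Roofline bound: the lower of the compute roof and the memory roof."""
    return min(peak, bandwidth_gb_s * intensity)


# Assumed GTX 280 organization: 240 SIMD lanes at a 1.296 GHz shader clock.
GTX280_LANES = 240
GTX280_CLOCK_GHZ = 1.296

# A fused multiply-add counts as 2 FLOPs; dual-issuing an extra multiply makes 3.
sp_fma_only = peak_gflops(GTX280_LANES, GTX280_CLOCK_GHZ, 2)
sp_fma_plus_mul = peak_gflops(GTX280_LANES, GTX280_CLOCK_GHZ, 3)

# Assumed double-precision peaks (GFLOP/sec) and measured bandwidths (GB/sec).
GTX280_DP = (78.0, 127.0)
I7_920_DP = (42.66, 16.4)

gtx280_ridge = ridge_point(*GTX280_DP)
i7_ridge = ridge_point(*I7_920_DP)

# At 1 FLOP/byte the GPU is already past its ridge point (compute-bound),
# while the CPU is still left of its ridge point (memory-bound).
gtx280_at_1 = attainable_gflops(*GTX280_DP, intensity=1.0)
i7_at_1 = attainable_gflops(*I7_920_DP, intensity=1.0)

if __name__ == "__main__":
    print(f"GTX 280 SP peak, FMA only:      {sp_fma_only:.0f} GFLOP/sec")
    print(f"GTX 280 SP peak, FMA + multiply: {sp_fma_plus_mul:.0f} GFLOP/sec")
    print(f"DP ridge points: GTX 280 {gtx280_ridge:.1f}, Core i7 {i7_ridge:.1f}")
    print(f"Attainable at 1 FLOP/byte: GTX 280 {gtx280_at_1}, Core i7 {i7_at_1}")
```

Under these assumed inputs, a kernel with an arithmetic intensity of 1 FLOP/byte already reaches
the GPU's double-precision roof but is still limited by memory bandwidth on the CPU, which is
what a ridge point farther to the left means in practice.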
The rooflines of the Core i7 920 and GTX 280 in Figure 4.28 illustrate the differences between the
two computers. The 920 has a slower clock rate than the 960 (2.66 GHz versus 3.2 GHz), but the rest
of the system is the same. Not only does the GTX 280 have much higher memory bandwidth
and double-precision floating-point performance, but its double-precision ridge point is also
considerably to the left. As mentioned above, the farther to the left the ridge point of the
roofline is, the easier it is to hit peak computational performance. The double-precision ridge
point is 0.6 for the GTX 280 versus 2.6 for the Core i7. For single-precision performance, the
 