Information Technology Reference
In-Depth Information
Table 10.2 The development of the world's most powerful parallel computer according to the
Top500 List [2]
Time
System model
CPU type
CPUs
GFLOPS
June 1993
TMC CM5
SuperSPARC I 32MHz
1024
131
June 1994
Intel Paragon
Intel 80860 50MHz
3680
184
June 1995
Fujitsu VPP
Fujitsu 105MHz
140
235.79
June 1996
Hitachi SR2201
HARP-1E 150MHz
1024
307.2
June 1997
Intel Paragon
Pentium Pro 200MHz
7264
1453
June 1998
Intel Paragon
Pentium Pro 200MHz
9152
1830.4
June 1999
Intel Paragon
Pentium Pro 333MHz
9472
3154
June 2000
Intel Paragon
Pentium Pro 333MHz
9632
3207
June 2001
IBM SP
POWER3 375MHz
8192
12288
June 2002
NEC SX6
NEC 1000MHz
5120
40960
June 2005
IBM BlueGene/L
PowerPC 440 700MHz
65536
183500
June 2006
IBM BlueGene/L
PowerPC 440 700MHz
131072
367000
June 2008
IBM BladeCenter Cluster
PowerXCell 8i 3200MHz
122400
1375776
June 2009
IBM BladeCenter Cluster
PowerXCell 8i 3200MHz
129600
1456704
Exercise 10.3. Suppose now that the P processors are not equal in computing
speed, where half of them have double the speed of the others. How should n equally
expensive computational tasks be divided?
˘
Exercise 10.4. We said in Sect. 10.2.2 that parallelizing the composite trapezoidal
rule can be done by two approaches. In the first approach the n 1 inner points are
divided among P homogeneous processors using (10.8)and(10.9). In the second
approach the integral domain Œa; b is divided into P segments, so that each proces-
sor applies the numerical integration rule on its assigned subdomain ŒX p ;X pC1 .
Derive the formula for calculating X p , so that the n intervals are fairly divided
among the processors. Also find out the exact difference (summed over all P proces-
sors) in the numbers of function evaluations and floating-point arithmetic operations
between the two approaches.
˘
Exercise 10.5. Derive the parallel algorithm of the composite Simpson's rule (6.4)
using two approaches that are similar to the case of the composite trapezoidal
rule.
˘
Exercise 10.6. If S ? is a desired speedup value, how can we choose the number
of processors P according to Amdahl's law (10.17)? Prove also that, if ˛>0 ,
the parallel efficiency is a monotonically decreasing function of P as the result of
Amdahl's law. If ? is the acceptable threshold of parallel efficiency, what can be
the maximum value of P ?
˘
Exercise 10.7. We recall that Gustafson-Barsis's law (10.18)uses ˛ P to denote the
fraction of serial computing time in T.P/ . What is the corresponding fraction ˛ of
non-parallelizable computing time in T.1/ ? If we know that the current speedup is
S.P/ D S 1 associated with T.P/ and the present value of ˛ P . By how much should
 
Search WWH ::




Custom Search