$$s_p = \sum_{i=i_{\mathrm{start},p}}^{i_{\mathrm{start},p}+n_p-1} f(x_i), \qquad (10.7)$$
where, for this example, assuming equally expensive function evaluations and
homogeneous processors, we have
$$n_p = \left\lfloor \frac{n-1}{P} \right\rfloor + \begin{cases} 1 & \text{if } p < \operatorname{mod}(n-1,P), \\ 0 & \text{else}, \end{cases} \qquad (10.8)$$
and
$$i_{\mathrm{start},p} = 1 + p\left\lfloor \frac{n-1}{P} \right\rfloor + \min\big(p, \operatorname{mod}(n-1,P)\big). \qquad (10.9)$$
Example 10.3. Suppose $n = 200$ and $P = 3$; then (10.8) gives $n_0 = 67$ and $n_1 = n_2 = 66$. Following (10.9), we have $i_{\mathrm{start},0} = 1$, $i_{\mathrm{start},1} = 68$, and $i_{\mathrm{start},2} = 134$. More specifically, processor 0 is responsible for the inner points $x_1$ through $x_{67}$, processor 1 for $x_{68}$ through $x_{133}$, and processor 2 for $x_{134}$ through $x_{199}$.
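The partition formulas (10.8)–(10.9) can be sketched in a few lines of Python (the function name `partition` is illustrative, not from the text), reproducing the numbers of Example 10.3:

```python
def partition(n, P):
    """Return the lists (n_p, i_start_p) for processors p = 0, ..., P-1,
    following formulas (10.8) and (10.9)."""
    chunk, rest = divmod(n - 1, P)   # floor((n-1)/P) and mod(n-1, P)
    n_p = [chunk + (1 if p < rest else 0) for p in range(P)]
    i_start = [1 + p * chunk + min(p, rest) for p in range(P)]
    return n_p, i_start

n_p, i_start = partition(200, 3)
print(n_p)      # [67, 66, 66]
print(i_start)  # [1, 68, 134]
```

Note that the $n_p$ values always sum to $n-1$, so every inner point is assigned to exactly one processor.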
So parallel computing in connection with the composite trapezoidal rule arises from the fact that each processor can independently compute its $s_p$ following (10.7). However, when the processors have finished the local computation of $s_p$, an additional computation of the following form is needed to complete the entire work:
$$\int_a^b f(x)\,dx \approx h\left(\frac{1}{2}\big(f(a)+f(b)\big) + \sum_{p=0}^{P-1} s_p\right). \qquad (10.10)$$
Note that the local results $s_p$ are so far only available on the different processors. Therefore, before we can calculate $\sum_{p=0}^{P-1} s_p$, the processors have to somehow share all the $s_p$ values. There are two approaches. In the first approach, we designate one processor as the master and consider all the other processors as slaves. All the slave processors pass their $s_p$ value to the master, which then computes the final result using (10.10). In case the slaves also want to know the final result, the master can send it to all of them. In the second approach, all processors have an equal role. The first action on each processor is to pass its own $s_p$ value to all the other $P-1$ processors, and receive in return $P-1$ different $s_p$ values. The second action on each processor is then to independently carry out the computation of (10.10).
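Either approach yields the same final value. As a serial illustration (a sketch, not the book's code), the whole procedure of (10.7)–(10.10) can be simulated by letting each loop iteration play the role of one processor; the grid convention $x_i = a + ih$, $i = 0, \ldots, n$, with $h = (b-a)/n$ is assumed here:

```python
import math

def parallel_trapezoidal(f, a, b, n, P):
    """Composite trapezoidal rule on the grid x_i = a + i*h, i = 0, ..., n,
    with h = (b - a)/n; the n - 1 inner points are distributed over P
    simulated processors via (10.8)-(10.9)."""
    h = (b - a) / n
    chunk, rest = divmod(n - 1, P)               # floor((n-1)/P), mod(n-1, P)
    partial = []
    for p in range(P):                           # each pass mimics one processor
        n_p = chunk + (1 if p < rest else 0)     # (10.8)
        i_start = 1 + p * chunk + min(p, rest)   # (10.9)
        # local partial sum over this processor's inner points, as in (10.7)
        partial.append(sum(f(a + i * h) for i in range(i_start, i_start + n_p)))
    # the communication step made explicit: combine the partial sums via (10.10)
    return h * (0.5 * (f(a) + f(b)) + sum(partial))

# Usage: integrate sin over [0, pi]; the exact value is 2
print(parallel_trapezoidal(math.sin, 0.0, math.pi, 200, 3))
```

By construction the result is independent of $P$; only the distribution of work changes.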
It is worth noting that the processors have to collaborate in calculating $\sum_{p=0}^{P-1} s_p$. In the first approach above, the communication is of the form all-to-one, possibly followed by a subsequent one-to-all communication. In the second approach, the communication is of the form all-to-all. No matter which approach is chosen, we can imagine that a considerable amount of communication programming is needed for summing up the different $s_p$ values possessed by the $P$ processors. Luckily, standard parallel programming libraries and languages have efficient built-in implementations for such collective communication operations.