$$s_p = \sum_{i=i_{\mathrm{start},p}}^{i_{\mathrm{start},p}+n_p-1} f(x_i), \qquad (10.7)$$
where, for this example, assuming equally expensive function evaluations and
homogeneous processors, we have
$$n_p = \left\lfloor \frac{n-1}{P} \right\rfloor + \begin{cases} 1 & \text{if } p < \operatorname{mod}(n-1,P), \\ 0 & \text{else}, \end{cases} \qquad (10.8)$$
and
$$i_{\mathrm{start},p} = 1 + p\left\lfloor \frac{n-1}{P} \right\rfloor + \min\big(p, \operatorname{mod}(n-1,P)\big). \qquad (10.9)$$
Example 10.3. Suppose $n = 200$ and $P = 3$; then (10.8) gives $n_0 = 67$ and $n_1 = n_2 = 66$. Following (10.9), we have $i_{\mathrm{start},0} = 1$, $i_{\mathrm{start},1} = 68$, and $i_{\mathrm{start},2} = 134$. More specifically, processor 0 is responsible for the inner points $x_1$ through $x_{67}$, processor 1 for $x_{68}$ through $x_{133}$, and processor 2 for $x_{134}$ through $x_{199}$.
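The partition formulas (10.8)–(10.9) can be sketched in a few lines of Python (the function name `partition` is illustrative, not from the text), reproducing the numbers of Example 10.3:

```python
def partition(n, P):
    """Return the lists (n_p, i_start_p) for processors p = 0, ..., P-1,
    following formulas (10.8) and (10.9)."""
    chunk, rest = divmod(n - 1, P)   # floor((n-1)/P) and mod(n-1, P)
    n_p = [chunk + (1 if p < rest else 0) for p in range(P)]
    i_start = [1 + p * chunk + min(p, rest) for p in range(P)]
    return n_p, i_start

n_p, i_start = partition(200, 3)
print(n_p)      # [67, 66, 66]
print(i_start)  # [1, 68, 134]
```

Note that the $n_p$ values always sum to $n-1$, so every inner point is assigned to exactly one processor.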
So parallel computing in connection with the composite trapezoidal rule arises from the fact that each processor can independently compute its $s_p$ following (10.7). However, when the processors have finished the local computation of $s_p$, an additional computation of the following form is needed to complete the entire work:
$$\int_a^b f(x)\,dx \approx h\left(\frac{1}{2}\big(f(a)+f(b)\big) + \sum_{p=0}^{P-1} s_p\right). \qquad (10.10)$$
Note that the local results $s_p$ are so far only available on the different processors. Therefore, before we can calculate $\sum_{p=0}^{P-1} s_p$, the processors have to somehow share all the $s_p$ values. There are two approaches. In the first approach, we designate one processor as the master and consider all the other processors as slaves. All the slave processors pass their $s_p$ value to the master, which then computes the final result using (10.10). In case the slaves also want to know the final result, the master can send it to all of them. In the second approach, all processors have an equal role. The first action on each processor is to pass its own $s_p$ value to all the other $P-1$ processors, and receive in return $P-1$ different $s_p$ values. The second action on each processor is then to independently carry out the computation of (10.10).
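Either approach yields the same final value. As a serial illustration (a sketch, not the book's code), the whole procedure of (10.7)–(10.10) can be simulated by letting each loop iteration play the role of one processor; the grid convention $x_i = a + ih$, $i = 0, \ldots, n$, with $h = (b-a)/n$ is assumed here:

```python
import math

def parallel_trapezoidal(f, a, b, n, P):
    """Composite trapezoidal rule on the grid x_i = a + i*h, i = 0, ..., n,
    with h = (b - a)/n; the n - 1 inner points are distributed over P
    simulated processors via (10.8)-(10.9)."""
    h = (b - a) / n
    chunk, rest = divmod(n - 1, P)               # floor((n-1)/P), mod(n-1, P)
    partial = []
    for p in range(P):                           # each pass mimics one processor
        n_p = chunk + (1 if p < rest else 0)     # (10.8)
        i_start = 1 + p * chunk + min(p, rest)   # (10.9)
        # local partial sum over this processor's inner points, as in (10.7)
        partial.append(sum(f(a + i * h) for i in range(i_start, i_start + n_p)))
    # the communication step made explicit: combine the partial sums via (10.10)
    return h * (0.5 * (f(a) + f(b)) + sum(partial))

# Usage: integrate sin over [0, pi]; the exact value is 2
print(parallel_trapezoidal(math.sin, 0.0, math.pi, 200, 3))
```

By construction the result is independent of $P$; only the distribution of work changes.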
It is worth noting that the processors have to collaborate in calculating $\sum_{p=0}^{P-1} s_p$. In the first approach above, the communication is of the form all-to-one, possibly followed by a subsequent one-to-all communication. In the second approach, the communication is of the form all-to-all. No matter which approach is chosen, we can imagine that a considerable amount of communication programming is needed for summing up the different $s_p$ values possessed by the $P$ processors. Luckily, standard parallel programming libraries and languages have efficient built-in implementations for such collective communication operations.