therefore cheap. However, for really intensive computations, it is likely that a combination
of vector and parallel hardware is ideal.
1.5 Parallel processors
In this concept (of which there are many variants), there are several physically distinct
processors (e.g. a few expensive ones or many cheaper ones). Programs and/or data can
reside on different processors, which have to communicate with one another.
There are two foreseeable ways in which this communication can be organised (rather
like the memory management which was described earlier). Either the programmer takes control
of the communication process, using a programming feature called message passing, or it is
done automatically, without user control. The second strategy is of course appealing and has
led to the development of “High Performance Fortran” or HPF (e.g. see Koelbel et al., 1995)
which has been designed as an extension to Fortran 95. “Directives”, which are treated as
comments by non-HPF compilers, are inserted into the Fortran 95 programs and allow data
to be mapped onto parallel processors, together with a specification of the operations on
such data that can be carried out in parallel. The attractive feature of this strategy is that
programs are “portable”, that is, they can easily be transferred from one computer to another.
One would also anticipate that manufacturers could produce compilers which made best
use of their specific type of hardware. At the time of writing, the first implementations of
HPF are just being reported.
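As a brief illustration, a minimal HPF sketch might take the following form (the array and
processor names here are hypothetical, chosen only for the example). The directives, read as
ordinary comments by a non-HPF compiler, map the arrays across four processors and assert
that the iterations of the loop may proceed in parallel:

   PROGRAM hpf_sketch
    REAL:: a(10000), b(10000)
    INTEGER:: i
   !HPF$ PROCESSORS p(4)
   !HPF$ DISTRIBUTE a(BLOCK) ONTO p
   !HPF$ ALIGN b(:) WITH a(:)
    b = 1.0
   !HPF$ INDEPENDENT
    DO i = 1, 10000
      a(i) = 2.0*b(i)      ! iterations carried out in parallel on the mapped data
    END DO
   END PROGRAM hpf_sketch

Compiled without HPF support, the lines beginning !HPF$ are simply ignored and the program
runs serially, which is precisely what makes the approach portable.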
An alternative to HPF, involving roughly the same level of user intervention, can be used
on specific hardware. Manufacturers provide “directives” which can be inserted by users
in programs and implemented by the compiler to parallelise sections of the code (usually
associated with DO-loops). Smith (2000) shows that this approach can be quite effective
for up to a modest number of parallel processors (say 10). However, such programs are not
portable to other machines.
A further alternative is to use OpenMP, a portable set of directives but limited to a
class of parallel machines with so-called “shared memory”. Although the codes in this book
have been rather successfully adapted for parallel processing using OpenMP (Pettipher and
Smith, 1997), the most popular strategy, applicable equally to “shared memory” and “dis-
tributed memory” systems, is described in Chapter 12. The programs therein have been
run successfully on clusters of PCs communicating via Ethernet and on shared and dis-
tributed memory supercomputers with their much more expensive communication systems.
This strategy of message passing under programmer control is realised by MPI (“message
passing interface”), a de facto standard, thereby ensuring portability (MPI Web
reference, 2003).
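For comparison, an OpenMP version of the loop sketched above needs only a pair of directives
which, like HPF directives, are treated as comments by compilers without OpenMP support (the
array names are again hypothetical):

   PROGRAM omp_sketch
    REAL:: a(10000), b(10000)
    INTEGER:: i
    b = 1.0
   !$OMP PARALLEL DO
    DO i = 1, 10000
      a(i) = 2.0*b(i)      ! iterations shared among the available threads
    END DO
   !$OMP END PARALLEL DO
   END PROGRAM omp_sketch

Under MPI, by contrast, the programmer launches several cooperating processes and directs
all communication between them explicitly. A bare skeleton, using only the standard start-up
and shut-down calls and deferring the real work and message passing to Chapter 12, is:

   PROGRAM mpi_sketch
    USE mpi                ! or INCLUDE 'mpif.h' on older installations
    INTEGER:: ierr, rank, nprocs
    CALL MPI_INIT(ierr)
    CALL MPI_COMM_RANK(MPI_COMM_WORLD, rank, ierr)    ! this process's number
    CALL MPI_COMM_SIZE(MPI_COMM_WORLD, nprocs, ierr)  ! total number of processes
    ! each process now operates on its own portion of the data,
    ! exchanging messages with the others as required
    CALL MPI_FINALIZE(ierr)
   END PROGRAM mpi_sketch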
1.6 BLAS libraries
As was mentioned earlier, programs implementing the Finite Element Method make inten-
sive use of matrix or array structures. For example, a study of any of the programs in
the succeeding chapters will reveal repeated use of the intrinsic function MATMUL described in
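As a reminder of its use, MATMUL takes two conforming arrays and returns their matrix
product; a minimal sketch (the array sizes are arbitrary, chosen only for illustration) is:

   PROGRAM matmul_sketch
    REAL:: a(3,2), b(2,4), c(3,4)
    a = 1.0
    b = 2.0
    c = MATMUL(a, b)       ! a (3 x 2) times a (2 x 4) array gives a 3 x 4 result
    PRINT *, c(1,1)        ! each entry of c equals 4.0 here
   END PROGRAM matmul_sketch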