therefore cheap. However, for really intensive computations, it is likely that a combination
of vector and parallel hardware is ideal.
1.5 Parallel processors
In this concept (of which there are many variants), there are several physically distinct
processors (e.g. a few expensive ones or many cheaper ones). Programs and/or data can
reside on different processors, which have to communicate with one another.
There are two foreseeable ways in which this communication can be organised (rather
like the memory management which was described earlier). Either the programmer takes control
of the communication process, using a programming feature called message passing, or it is
done automatically, without user control. The second strategy is of course appealing and has
led to the development of “High Performance Fortran” or HPF (e.g. see Koelbel et al., 1995)
which has been designed as an extension to Fortran 95. “Directives”, which are treated as
comments by non-HPF compilers, are inserted into the Fortran 95 programs and allow data
to be mapped onto parallel processors, together with a specification of the operations on
such data that can be carried out in parallel. The attractive feature of this strategy is that
programs are “portable”, that is, they can easily be transferred from one computer to another.
One would also anticipate that manufacturers could produce compilers which made best
use of their specific type of hardware. At the time of writing, the first implementations of
HPF are just being reported.
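As a brief illustration, a minimal HPF sketch might take the following form (the array and
processor names here are hypothetical, chosen only for the example). The directives, read as
ordinary comments by a non-HPF compiler, map the arrays across four processors and assert
that the iterations of the loop may proceed in parallel:

   PROGRAM hpf_sketch
    REAL:: a(10000), b(10000)
    INTEGER:: i
   !HPF$ PROCESSORS p(4)
   !HPF$ DISTRIBUTE a(BLOCK) ONTO p
   !HPF$ ALIGN b(:) WITH a(:)
    b = 1.0
   !HPF$ INDEPENDENT
    DO i = 1, 10000
      a(i) = 2.0*b(i)      ! iterations carried out in parallel on the mapped data
    END DO
   END PROGRAM hpf_sketch

Compiled without HPF support, the lines beginning !HPF$ are simply ignored and the program
runs serially, which is precisely what makes the approach portable.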
An alternative to HPF, involving roughly the same level of user intervention, can be used
on specific hardware. Manufacturers provide “directives” which can be inserted by users
in programs and implemented by the compiler to parallelise sections of the code (usually
associated with DO-loops). Smith (2000) shows that this approach can be quite effective
for up to a modest number of parallel processors (say 10). However, such programs are not
portable to other machines.
A further alternative is to use OpenMP, a portable set of directives but limited to a
class of parallel machines with so-called “shared memory”. Although the codes in this book
have been rather successfully adapted for parallel processing using OpenMP (Pettipher and
Smith, 1997), the most popular strategy, applicable equally to “shared memory” and “dis-
tributed memory” systems, is described in Chapter 12. The programs therein have been
run successfully on clusters of PCs communicating via Ethernet and on shared and dis-
tributed memory supercomputers with their much more expensive communication systems.
This strategy of message passing under programmer control is realised by MPI (“message
passing interface”), a de facto standard, thereby ensuring portability (MPI Web
reference, 2003).
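For comparison, an OpenMP version of the loop sketched above needs only a pair of directives
which, like HPF directives, are treated as comments by compilers without OpenMP support (the
array names are again hypothetical):

   PROGRAM omp_sketch
    REAL:: a(10000), b(10000)
    INTEGER:: i
    b = 1.0
   !$OMP PARALLEL DO
    DO i = 1, 10000
      a(i) = 2.0*b(i)      ! iterations shared among the available threads
    END DO
   !$OMP END PARALLEL DO
   END PROGRAM omp_sketch

Under MPI, by contrast, the programmer launches several cooperating processes and directs
all communication between them explicitly. A bare skeleton, using only the standard start-up
and shut-down calls and deferring the real work and message passing to Chapter 12, is:

   PROGRAM mpi_sketch
    USE mpi                ! or INCLUDE 'mpif.h' on older installations
    INTEGER:: ierr, rank, nprocs
    CALL MPI_INIT(ierr)
    CALL MPI_COMM_RANK(MPI_COMM_WORLD, rank, ierr)    ! this process's number
    CALL MPI_COMM_SIZE(MPI_COMM_WORLD, nprocs, ierr)  ! total number of processes
    ! each process now operates on its own portion of the data,
    ! exchanging messages with the others as required
    CALL MPI_FINALIZE(ierr)
   END PROGRAM mpi_sketch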
1.6 BLAS libraries
As was mentioned earlier, programs implementing the Finite Element Method make inten-
sive use of matrix or array structures. For example, a study of any of the programs in
the succeeding chapters will reveal repeated use of the intrinsic function MATMUL described in
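As a reminder of its use, MATMUL takes two conforming arrays and returns their matrix
product; a minimal sketch (the array sizes are arbitrary, chosen only for illustration) is:

   PROGRAM matmul_sketch
    REAL:: a(3,2), b(2,4), c(3,4)
    a = 1.0
    b = 2.0
    c = MATMUL(a, b)       ! a (3 x 2) times a (2 x 4) array gives a 3 x 4 result
    PRINT *, c(1,1)        ! each entry of c equals 4.0 here
   END PROGRAM matmul_sketch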