3.5 WHEN NOT TO USE PARALLEL COMPUTING
It may seem odd, in a chapter promoting the use of parallel computers, to include a section about
when not to use them. However, it will save a lot of time if, before rushing headlong into parallel
programming, analysis or modelling, you stop and consider whether your problem really requires so
much effort. Some questions to ask are as follows: How often is this code to be run in production?
How long does it take to run at the moment? If the answers are 'not often' and 'not very long', then
don't bother to think parallel! It will take at least twice as long, and probably a lot longer, to write,
test and debug the parallel version of your code than it would to write (or optimise) some serial code.
If you only need to run the model once or twice, then even runtimes of a week or two are still quicker
than building a parallel version of your code. However, you will also need to consider the memory
requirements of your problem; if they are many times the real memory available on your serial
machine, then parallel may well be the right route. If you have a program that takes a month or more
to run, then parallelism may be right, and if you need to run a program or model many thousands or
millions of times, then again parallel is the correct path to consider.
It is also necessary to consider the size of the task that you want to parallelise. If the problem
is a small kernel which cannot easily be subdivided, and each step depends on the previous result,
then it is unlikely to parallelise: you cannot split the job between processors, and you cannot start
computing the next part of the problem until the current step is complete. Problems of this type are,
however, rare in geography.
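As a small illustrative sketch (not taken from the text), the following C fragment shows the kind of
loop-carried dependency just described: each iteration needs the value produced by the one before
it, so the work cannot usefully be divided between processors.

#include <stdio.h>

int main(void)
{
    double x[1000];
    x[0] = 1.0;

    /* Each x[i] depends on x[i-1], so iteration i cannot start until
       iteration i-1 has finished: the loop is inherently serial and
       gains nothing from being split across processors. */
    for (int i = 1; i < 1000; i++) {
        x[i] = 0.5 * x[i - 1] + 1.0;
    }

    printf("%f\n", x[999]);
    return 0;
}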
3.6 WHEN TO USE PARALLEL COMPUTING AND HOW
First, reread the previous section, which attempts to talk you out of parallelisation. If you can
pass all the aforementioned tests, or are persistent in the face of them, then now is the time to think
parallel. You need to decide what sort of parallelism you want to use, and then consider the
language you want to use. Some of these choices will be dictated by what sort of parallel machine
you have access to and which parallel languages it supports.
If you are certain that you need to go parallel, then the next question is as follows: Do you have
access to a parallel computer? Even if at first glance the answer seems to be no, do not be
disheartened, as it is possible, with some extra software, to turn a room (or building) full of workstations
into a virtual parallel computer. There are free versions of MPI available that provide exactly the
same functionality on a network of workstations as the MPI found on large supercomputers. Obviously,
this is not as fast as a dedicated parallel machine, but it is a lot cheaper. This approach can also be
used for development work on parallel software, since you will only be allocated a limited amount
of time on a large machine for your work, which can often be used up very quickly during
development. The OpenMP parallel processing API (application programming interface) has also been
developed for writing parallel programs in C, C++ and Fortran on various platforms, from desktop
computers to supercomputers (Chapman 2008). OpenMP is currently in version 4.0
(http://openmp.org/wp/).
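To give a feel for OpenMP, here is a minimal sketch (not from the text): a single compiler directive
asks for the iterations of a loop to be shared among the available threads, and the reduction clause
combines each thread's partial result safely. With GCC it would typically be compiled with the
-fopenmp flag.

#include <stdio.h>
#include <omp.h>

int main(void)
{
    const int n = 1000000;
    double sum = 0.0;

    /* The directive shares the loop iterations among the threads;
       reduction(+:sum) safely combines each thread's partial sum. */
    #pragma omp parallel for reduction(+:sum)
    for (int i = 0; i < n; i++) {
        sum += (double)i * (double)i;
    }

    printf("sum = %f (using up to %d threads)\n", sum, omp_get_max_threads());
    return 0;
}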
Parallel problems can be broken down into several types. The first is fine-grained parallelism,
which refers to subtasks that must communicate many times per second; the second is coarse-grained
parallelism, where subtasks communicate much less frequently. The final group is known as
trivially parallel or embarrassingly parallel and, as the name suggests, is the easiest to handle, since
these programs have subtasks that rarely or never have to communicate. The example of producing
nine babies given earlier is a real-world instance: there is no need for the women involved to be in
constant communication with each other, or even to be in the same place. For certain types of
computing problem, this is also the case; for instance, if you want to carry out an error simulation on a
model, which involves running the model 100 times from slightly different starting points and
comparing the spread of the results, then the easiest way to make this run faster is to run each model
on a separate processor and gather the results at the end.
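As a sketch of how such an embarrassingly parallel error simulation might be farmed out with MPI
(this code is not from the text, and run_model() is a hypothetical placeholder for the actual
geographical model), each process below works through its own share of the 100 runs independently,
and the only communication is a single collection of results at the end.

#include <stdio.h>
#include <mpi.h>

/* Hypothetical placeholder for the model: here it simply derives a
   value from the starting point it is given. */
static double run_model(int starting_point)
{
    return starting_point * 1.5;
}

int main(int argc, char *argv[])
{
    int rank, size;
    const int n_runs = 100;
    double local_sum = 0.0, global_sum = 0.0;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    /* Each process takes every size-th run; no communication is needed
       while the runs are in progress (embarrassingly parallel). */
    for (int run = rank; run < n_runs; run += size) {
        local_sum += run_model(run);
    }

    /* The only communication: gather the results on process 0. */
    MPI_Reduce(&local_sum, &global_sum, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);

    if (rank == 0) {
        printf("combined result of %d runs: %f\n", n_runs, global_sum);
    }

    MPI_Finalize();
    return 0;
}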