an approach will avoid using #pragma omp single, but will have to repeatedly
spawn new threads and then terminate them, giving rise to more overhead in thread
creation.
OpenMP programming is quite simple, because the standard consists of only
a small number of directives and clauses. Much of the parallelization work, such
as work division and task scheduling, is hidden from the user. On the one hand,
this programming philosophy provides great user friendliness. On the other hand,
however, the user has very little control over which thread accesses which part of
the shared memory. This limitation typically hurts performance on modern processor
architectures, which rely heavily on the use of caches. To fully utilize a cache, the
data items that are read from memory into cache should be reused as much as
possible. Data locality, either when data items located close together in memory
participate in one computing operation, or when one data item is repeatedly used
in consecutive operations, gives rise to good cache usage, but is hard to enforce in
an OpenMP program.
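As a small illustration (using hypothetical array and function names), the following
sketch shows one way of helping data locality in an OpenMP loop: the matrix is
traversed row by row, matching the row-major layout of C arrays, and the static
schedule gives each thread a contiguous block of rows.

    #include <omp.h>

    #define N 1000

    /* Scale all entries of an N x N matrix. The loop nest traverses the
       matrix row by row, so consecutive iterations touch data items that
       lie close together in memory and are likely already in cache. */
    void scale_matrix(double a[N][N], double factor)
    {
      /* schedule(static) assigns each thread a contiguous block of rows */
      #pragma omp parallel for schedule(static)
      for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++)   /* innermost index varies fastest */
          a[i][j] *= factor;
    }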
10.3.2
MPI Programming
MPI is a standard specification for message-passing programming on distributed-
memory computers. On shared-memory systems it is also common to have MPI
installations that are implemented using efficient shared-memory intrinsics for pass-
ing messages. Thus MPI is the most portable approach to parallel programming.
There are two parts of MPI: The first part contains more than 120 functions
and constitutes the core of MPI [26], whereas the second part contains advanced
extensions [14]. Here, we only intend to give a brief introduction to MPI pro-
gramming. For more in-depth learning, we refer the reader to the standard MPI
textbooks [15, 23].
A message in the context of MPI is simply an array of data elements of a prede-
fined data type. The logical execution units in an MPI program are called processes,
which are initiated by the MPI_Init call at the start of a program and terminated
by the MPI_Finalize call at the end. The number of started MPI processes is usu-
ally the same as the number of available processors, although one processor can
in principle be assigned several MPI processes. All the started MPI processes
constitute a so-called global MPI communicator named MPI_COMM_WORLD, which can
be subdivided into smaller communicators if needed. Almost all the MPI functions
require an input argument of type MPI communicator, which together with a process
rank (between 0 and P-1) is used to specify a particular MPI process.
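To make these notions concrete, the following minimal sketch shows the typical
skeleton of an MPI program written in C: each started process calls MPI_Init first
and MPI_Finalize last, and queries its own rank and the total number of processes
within MPI_COMM_WORLD.

    #include <stdio.h>
    #include <mpi.h>

    int main(int argc, char **argv)
    {
      int rank, size;

      MPI_Init(&argc, &argv);                 /* start the MPI processes      */
      MPI_Comm_rank(MPI_COMM_WORLD, &rank);   /* this process's rank: 0..P-1  */
      MPI_Comm_size(MPI_COMM_WORLD, &size);   /* P, the number of processes   */

      printf("Hello from process %d of %d\n", rank, size);

      MPI_Finalize();                         /* terminate the MPI processes  */
      return 0;
    }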
Compared with OpenMP, MPI programming is clearly more difficult, not only
because of the large number of available functions, but also because one MPI call
normally requires many arguments. For example, the simplest function in C for
sending a message is
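    int MPI_Send(void *buf, int count, MPI_Datatype datatype,
                 int dest, int tag, MPI_Comm comm);

where buf points to the data to be sent, count and datatype describe the length and
type of the message, dest is the rank of the receiving process, tag is a user-chosen
label for the message, and comm is the communicator. Even this simplest send
routine thus takes six arguments.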
 