A negative voxel index value indicates a "finish message" signaling that the
sender process has finished its work. In this case, there is no companion message
containing the correlation value, so the master simply decrements the count and
continues to the next iteration of the loop. When the master receives the "finish
messages" from all the compute processes, the count becomes zero. Finally, the
while loop of the master ends, and it closes the output file and proceeds to call
MPI_Barrier.
MPI_Barrier and MPI_Allreduce
The compute processes also call MPI_Barrier after completing their work.
MPI_Barrier is a collective operation that takes an MPI communicator as its
only argument. This is set to the default communicator MPI_COMM_WORLD. All
the processes calling MPI_Barrier wait in the function until all the processes in
the group have made a call to MPI_Barrier. This ensures that a process can begin
the post-barrier part of its work only after all the other processes have finished the
pre-barrier part of their work. This functionality is very useful for synchronizing
processes. Using it, different phases of computation can be cleanly separated
from one another, ensuring that MPI messages sent from a later phase do not
interfere with messages sent during the earlier phases of computation.
Finally, all the processes call MPI_Allreduce to compute the sum of the
absolute values of correlations between every voxel pair. While computing the
correlations, each process maintains a running sum of the absolute values of the
correlations that it computes (line 69). In the end, the running sums stored
at all the compute processes need to be added to compute the final sum. This is
done efficiently using the collective function MPI_Allreduce. This function takes
a number of data elements from each process as its input, applies an associative
global reduction operator, and distributes the results to all the processes. For
example, if the input at each process is an array of 10 integers and the global
reduction operation is the sum, then the output will be an array of 10 integers
containing, element-wise, the sum of the corresponding integers across all the
processes.
The first argument to this function is a pointer to the buffer containing the
input data. The second argument is a pointer to the buffer where the output is to
be stored. To save memory and copying overhead, many applications want the
output in the same place as the input. To achieve this, MPI_IN_PLACE can be used
as the first argument; in this case, the second argument is a pointer to the combined
input and output buffer. The third and fourth arguments to this function are the
number of elements and the data type. The fifth argument is the reduction operation
to be performed. The sixth argument is the communicator.
The set of predefined reduction operators in MPI is quite rich. It includes
MPI_MAX for the maximum operation, MPI_MIN for the minimum operation,
MPI_SUM for the sum operation, MPI_PROD for the product operation, MPI_LAND
for the logical-and operation, MPI_BAND for the bit-wise-and operation, MPI_LOR
for the logical-or operation, MPI_BOR for the bit-wise-or operation, MPI_LXOR
for the logical-xor operation, MPI_BXOR for the bit-wise-xor operation,
MPI_MAXLOC for the location of the maximum value, and MPI_MINLOC for the
location of the minimum value. A reduction operator outside this set is rarely
needed. However, MPI allows users to create new reduction