A negative voxel index value indicates a "finish message" signaling that the
sender process has finished its work. In this case, there is no companion message
containing the correlation value, so the master simply decrements the count and
continues to the next iteration of the loop. When the master has received the
"finish messages" from all the compute processes, the count becomes zero, the
master's while loop ends, and it closes the output file and proceeds to call
MPI_Barrier.
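To make the control flow concrete, the following sketch shows how such a receive loop might look. The variable names, message tags, and the exact message layout are illustrative assumptions, not the actual code from the text.

/* Sketch of the master's receive loop (names and message layout assumed). */
int remaining = num_compute_procs;      /* compute processes still working */
while (remaining > 0) {
    int voxel_index;
    MPI_Status status;
    MPI_Recv(&voxel_index, 1, MPI_INT, MPI_ANY_SOURCE, MPI_ANY_TAG,
             MPI_COMM_WORLD, &status);
    if (voxel_index < 0) {              /* "finish message" */
        remaining--;                    /* no companion message follows */
        continue;
    }
    double corr;                        /* companion message: the correlation */
    MPI_Recv(&corr, 1, MPI_DOUBLE, status.MPI_SOURCE, MPI_ANY_TAG,
             MPI_COMM_WORLD, &status);
    fprintf(outfile, "%d %f\n", voxel_index, corr);
}
fclose(outfile);
MPI_Barrier(MPI_COMM_WORLD);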
MPI_Barrier and MPI_Allreduce
The compute processes also call MPI_Barrier after completing their work.
MPI_Barrier is a collective operation that takes an MPI communicator as its
only argument; here it is set to the default communicator MPI_COMM_WORLD. All
the processes calling MPI_Barrier wait in the function until every process in
the group has made the call. This ensures that a process can begin
the post-barrier part of its work only after all the other processes have finished the
pre-barrier part of theirs. This functionality is very useful for synchronizing
processes: different phases of computation can be cleanly separated
from one another, ensuring that MPI messages sent during a later phase do not
interfere with messages sent during earlier phases.
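A minimal sketch of this idiom follows; the two phase functions are hypothetical placeholders for the pre-barrier and post-barrier work.

compute_phase_one();            /* may exchange phase-one messages */
MPI_Barrier(MPI_COMM_WORLD);    /* every process blocks here until all arrive */
compute_phase_two();            /* phase-two messages cannot overtake phase-one ones */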
Finally, all the processes call MPI_Allreduce to compute the sum of the
absolute values of the correlations over every voxel pair. While computing the
correlations, each process maintains a running sum of the absolute values of the
correlations it computes (line 69). In the end, these running sums, stored
at all the compute processes, need to be added to obtain the final sum. This is
done efficiently using the collective function MPI_Allreduce, which takes
a number of data elements from each process as its input, applies an associative
global reduction operator, and distributes the results to all the processes. For
example, if the input at each process is an array of 10 integers and the
reduction operation is the sum, then the output is an array of 10 integers con-
taining, element-wise, the sum of the corresponding integers across all the processes.
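The following self-contained sketch illustrates the call; the local value here is a stand-in for the running sum of absolute correlations, not the book's actual variable.

#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);
    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    double local_abs_sum = 1.0 + rank;  /* stand-in for the running sum of |correlation| */
    double global_abs_sum;
    MPI_Allreduce(&local_abs_sum, &global_abs_sum, 1, MPI_DOUBLE,
                  MPI_SUM, MPI_COMM_WORLD);
    /* Every process now holds the same global sum. */
    printf("rank %d: global sum = %f\n", rank, global_abs_sum);
    MPI_Finalize();
    return 0;
}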
The first argument to this function is a pointer to the buffer containing the
input data. The second argument is a pointer to the buffer where the output is to
be stored. To save memory and copying overhead, many applications want the
output in the same place as the input; to achieve this, MPI_IN_PLACE can be
passed as the first argument, in which case the second argument points to the
combined input and output buffer. The third and fourth arguments are the number
of elements and their datatype. The fifth argument is the reduction operation to be
performed, and the sixth is the communicator.
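In the in-place form, the call from the previous sketch would instead be written as follows:

double abs_sum = 0.0;   /* holds the local sum before the call, the global sum after */
MPI_Allreduce(MPI_IN_PLACE, &abs_sum, 1, MPI_DOUBLE,
              MPI_SUM, MPI_COMM_WORLD);
/* abs_sum now contains the global sum; no separate output buffer is needed. */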
The set of predefined reduction operators in MPI is quite rich. It includes
MPI_MAX for maximum, MPI_MIN for minimum, MPI_SUM for sum, MPI_PROD for
product, MPI_LAND for logical and, MPI_BAND for bit-wise and, MPI_LOR for
logical or, MPI_BOR for bit-wise or, MPI_LXOR for logical xor, MPI_BXOR for
bit-wise xor, MPI_MAXLOC for the maximum value and its location, and
MPI_MINLOC for the minimum value and its location. A reduction operator
outside this set is rarely needed; however, MPI allows users to create new reduction
operators with MPI_Op_create.
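As an illustration, the sketch below registers a hypothetical operator that reduces to the element-wise maximum of absolute values. MPI_Op_create and MPI_Op_free are standard MPI calls, but the operator itself, along with the values buffer and count n, is invented for this example.

#include <mpi.h>
#include <math.h>

/* Hypothetical user-defined reduction: element-wise maximum of absolute
   values. The combining function must be associative (here it is also
   commutative), as MPI requires. */
static void absmax_fn(void *in, void *inout, int *len, MPI_Datatype *dtype) {
    double *a = (double *)in, *b = (double *)inout;
    for (int i = 0; i < *len; i++)
        b[i] = fmax(fabs(a[i]), fabs(b[i]));
}

/* Registration and use (inside an initialized MPI program): */
MPI_Op absmax;
MPI_Op_create(absmax_fn, 1 /* commutative */, &absmax);
MPI_Allreduce(MPI_IN_PLACE, values, n, MPI_DOUBLE, absmax, MPI_COMM_WORLD);
MPI_Op_free(&absmax);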