rate is approximately 10⁻⁵ per nucleotide per replication cycle [19], and the recombination rate is two orders of magnitude higher, approximately 10⁻³ per nucleotide per replication cycle for viruses infecting T cells [20]. The length of the HIV-1 replication cycle is estimated at between 1 and 3 days [21].
2.4 PARALLELIZATION WITH MPI
The simulation model described earlier will benefit from parallelization by simply
having replicate simulations run simultaneously. Replicates have identical starting
conditions, but because of the stochastic nature of mutation, recombination, selection,
and other events, the states of replicate simulations diverge. This type of problem is referred to as embarrassingly, or naturally, parallel because of the ease with which a parallel programming model can be implemented. More formally, this is an example of a single
program multiple data (SPMD) parallel programming model [1]. MPI is particularly
well suited to this programming model and was used to run replicate simulations in
parallel. MPI provides a set of calls (an interface) to a library of subroutines (Fortran)
or functions (C) that control communication between processors. Descriptions of the
MPI-1 and MPI-2 standards can be found at http://www-unix.mcs.anl.gov/mpi/. A good primer on MPI is the book Using MPI [22].
2.4.1 Initialization and Termination of MPI
The basic parallelization strategy applied here is to run each replicate simulation on a
separate processor and afterward gather results from each process, summarize these,
and write a single output file to disk. Figure 2.6 gives the Fortran 90 pseudocode in the
main program for this approach. These instructions are executed on each processor.
Near the beginning of the code is a C preprocessor directive that begins with #include. This directive identifies the MPI header file for Fortran, mpif.h, which contains definitions of variables and constants used by MPI. For example, the default communicator, MPI_COMM_WORLD, which defines the communication context and set of all processes used, is defined in this header file. This directive must be used in
every program and subprogram that makes MPI calls. Invoking preprocessing usually
requires setting a compiler switch or identifying files for automatic preprocessing
with the suffix “.F” or “.F90,” as opposed to “.f” or “.f90” for files that do not require
preprocessing.
Then, after some variable declarations, MPI is initialized with a call to the MPI subroutine MPI_INIT. This must be the first MPI call. The variable istat is used to store the error status of the MPI call. Following this statement, two calls are made, one to MPI_COMM_SIZE to query the number of processors being used (npes) and the other to MPI_COMM_RANK to identify the processor number on which execution is taking place (mype). The next statement identifies the replicate number being run on the processor. Processors are counted from zero, so the replicate number is mype + 1. This information is used when summarizing the results from replicate simulations.
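The sequence of calls just described can be sketched in Fortran 90 as follows. This is an illustrative skeleton, not the code of Figure 2.6 itself: the program name, the replicate-number variable irep, and the placeholder simulation step are assumptions, while istat, npes, and mype follow the naming used in the text.

```fortran
      program replicate_sims
      implicit none
#include "mpif.h"
! mpif.h defines MPI constants such as MPI_COMM_WORLD;
! compiling this file with a .F90 suffix triggers preprocessing.
      integer :: istat, npes, mype, irep

      call MPI_INIT(istat)                              ! must be the first MPI call
      call MPI_COMM_SIZE(MPI_COMM_WORLD, npes, istat)   ! number of processors in use
      call MPI_COMM_RANK(MPI_COMM_WORLD, mype, istat)   ! this processor's number (from 0)

      irep = mype + 1     ! processors count from zero, so replicate number is mype + 1

      ! ... run one replicate simulation here, then gather results
      ! from all processes, summarize them, and write a single
      ! output file from one process ...

      call MPI_FINALIZE(istat)                          ! must be the last MPI call
      end program replicate_sims
```

Launched on npes processors, each copy of this program executes the same instructions on its own data, which is exactly the SPMD pattern described in Section 2.4.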