We will now illustrate the use of this set of packages, which will then be
used throughout this chapter to show how parallel computing applies to Bayesian
networks.
The first step is to load the snow and the rsprng packages.
> library(snow)
> library(rsprng)
The Rmpi and rpvm packages are loaded by snow as needed. Subsequently, we
need to spawn the slave processes and initialize the cluster with the makeCluster
function.
> cl = makeCluster(2, type = "MPI")
Loading required package: Rmpi
The first argument of makeCluster specifies the number of slave processes to
spawn, which is usually between 2 and the number of processes that can run
concurrently without overcommitting any hardware resource. The second argument
specifies the communication mechanism used between the master and the
slave processes; possible values are "SOCK" to use sockets (the default), "MPI" to
use Rmpi, and "PVM" to use rpvm.
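As a rough sketch of how the first argument might be chosen, the snippet below bounds the number of slave processes by the number of available cores. It assumes the parallel package distributed with R, which provides makeCluster with a snow-like interface, and socket communication (called "PSOCK" in parallel), so that no MPI or PVM installation is needed. The names n.cores and n.slaves are illustrative only.

```r
library(parallel)  # assumption: used here as a stand-in for snow

# number of cores reported by the operating system (may be NA)
n.cores = detectCores()
# assumption: two slaves as in the text, fewer on a single-core machine
n.slaves = if (is.na(n.cores)) 2 else max(1, min(2, n.cores))

# spawn the slave processes over sockets
cl = makeCluster(n.slaves, type = "PSOCK")
# each slave reports its own process ID, showing they are separate R processes
pids = unlist(clusterCall(cl, Sys.getpid))
print(pids)
# always shut the slaves down when the cluster is no longer needed
stopCluster(cl)
```

Calling stopCluster at the end releases the slave processes, which otherwise keep running in the background.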
Once the slave processes have been spawned, we can initialize their random
number generators.
> clusterSetupSPRNG(cl)
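To see what initializing the generators buys us, the sketch below checks that the slaves draw from distinct streams and that the draws can be reproduced. It does not use rsprng: it assumes the parallel package distributed with R and its clusterSetRNGStream function (L'Ecuyer streams) as a substitute, with an arbitrary seed.

```r
library(parallel)  # assumption: stand-in for snow and rsprng

cl = makeCluster(2, type = "PSOCK")

# give each slave its own random number stream, starting from a fixed seed
clusterSetRNGStream(cl, iseed = 42)
first = clusterEvalQ(cl, runif(3))

# resetting the streams with the same seed reproduces the same draws
clusterSetRNGStream(cl, iseed = 42)
second = clusterEvalQ(cl, runif(3))

stopCluster(cl)

# same seed, same numbers on each slave
print(identical(first, second))
# different slaves, different numbers
print(identical(first[[1]], first[[2]]))
```

Without this step, slaves started under identical conditions could produce overlapping or identical random sequences, which would bias any simulation split across them.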
The setup of the cluster is now completed, and we can start using it to speed up our
computations. For example, we can compute simultaneously the means of all the
variables of the marks data we used in Chap. 2,
> parApply(cl, X = marks, MARGIN = 2, FUN = mean)
MECH VECT ALG ANL STAT
38.95455 50.59091 50.60227 46.68182 42.30682
getting the same result as the call to colMeans we would have used to compute them in
a sequential way.
> colMeans(marks)
MECH VECT ALG ANL STAT
38.95455 50.59091 50.60227 46.68182 42.30682
The parApply function, along with parLapply and parSapply, represents
the most user-friendly way to set up embarrassingly parallel computations. These
functions are the parallel versions of apply, lapply, and sapply and work in
exactly the same way from the user's point of view.
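The correspondence between the parallel and the sequential functions can be checked directly, as in the sketch below. It assumes the parallel package distributed with R (which provides parApply, parLapply and parSapply with the same names) and uses the numeric columns of the built-in iris data as a stand-in for marks.

```r
library(parallel)  # assumption: stand-in for snow

# stand-in data: the four numeric columns of iris
X = as.matrix(iris[, 1:4])
cols = as.list(iris[, 1:4])

cl = makeCluster(2, type = "PSOCK")
# parallel versions, with the cluster as an extra first argument
par.means = parApply(cl, X = X, MARGIN = 2, FUN = mean)
par.sds = parSapply(cl, cols, sd)
par.ranges = parLapply(cl, cols, range)
stopCluster(cl)

# sequential versions, called with the same remaining arguments
seq.means = apply(X, MARGIN = 2, FUN = mean)
seq.sds = sapply(cols, sd)
seq.ranges = lapply(cols, range)

print(all.equal(par.means, seq.means))
print(all.equal(par.sds, seq.sds))
print(all.equal(par.ranges, seq.ranges))
```

For computations this small the communication overhead outweighs any gain; the parallel versions pay off when each call to FUN is expensive.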
Problems which are not embarrassingly parallel, or which cannot be divided into
identical parts, can be tackled using a combination of clusterExport (to copy
the data to the slave R processes) and clusterEvalQ (to make the slave processes
execute arbitrary R commands). For instance, we may be interested in comparing
Pearson's and Spearman's correlation matrices for the marks data, and we may
want to estimate these matrices in parallel. To achieve that, we can first export the
marks data to the slave processes,
> clusterExport(cl, list("marks"))