We will now illustrate the use of this set of packages, which will then be
used throughout this chapter to show how parallel computing applies to Bayesian
networks.
The first step is to load the snow and the rsprng packages.
> library(snow)
> library(rsprng)
The Rmpi and rpvm packages are loaded by snow as needed. Subsequently, we
need to spawn the slave processes and initialize the cluster with the
makeCluster function.
> cl = makeCluster(2, type = "MPI")
Loading required package: Rmpi
The first argument of makeCluster specifies the number of slave processes to
spawn, which is usually between 2 and the number of processes that can run
concurrently without overcommitting any hardware resource. The second argument
specifies the communication mechanism used between the master and the slave
processes; possible values are "SOCK" to use sockets (the default), "MPI" to
use Rmpi, and "PVM" to use rpvm.
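As a rough illustration of how the number of slave processes might be chosen, the sketch below assumes the parallel package distributed with R, which provides its own makeCluster together with detectCores; there the socket-based cluster type is called "PSOCK". The variable names n.cores and n.slaves are hypothetical, and leaving one core free for the master process is only one possible rule of thumb.

```r
library(parallel)

# Number of cores reported by the operating system; may be NA on some platforms.
n.cores = detectCores()
# Use at least 2 slaves, and otherwise leave one core free for the master.
n.slaves = if (is.na(n.cores)) 2 else max(2, n.cores - 1)

# Spawn the slave processes, communicating over sockets.
cl = makeCluster(n.slaves, type = "PSOCK")
length(cl)   # number of slave processes in the cluster
stopCluster(cl)
```

On a machine with fewer than three cores this rule still spawns 2 slaves, so it may overcommit the processor slightly.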
Once the slave processes have been spawned, we can initialize their random
number generators.
> clusterSetupSPRNG(cl)
The setup of the cluster is now completed, and we can start using it to speed up our
computations. For example, we can compute simultaneously the means of all the
variables of the marks data we used in Chap. 2,
> parApply(cl, X = marks, MARGIN = 2, FUN = mean)
    MECH     VECT      ALG      ANL     STAT
38.95455 50.59091 50.60227 46.68182 42.30682
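getting the same result as the call to colMeans we would have used to compute
them in a sequential way (calling mean directly on a data frame no longer
returns column means in R 3.0.0 and later).
> colMeans(marks)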
    MECH     VECT      ALG      ANL     STAT
38.95455 50.59091 50.60227 46.68182 42.30682
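The agreement between the parallel and the sequential computation can be checked on any numeric matrix. The following is a minimal sketch, assuming the parallel package distributed with R (which provides parApply with the same interface) and a simulated matrix named scores standing in for the marks data; both the simulated values and the object names are hypothetical.

```r
library(parallel)

cl = makeCluster(2)

# Hypothetical stand-in for the marks data: 88 observations, 5 variables.
set.seed(42)
scores = matrix(rnorm(88 * 5, mean = 50, sd = 10), ncol = 5,
  dimnames = list(NULL, c("MECH", "VECT", "ALG", "ANL", "STAT")))

# Column means computed by the slave processes and by the master alone.
par.means = parApply(cl, X = scores, MARGIN = 2, FUN = mean)
seq.means = apply(scores, MARGIN = 2, FUN = mean)

# Compare the values only, ignoring names.
all.equal(as.vector(par.means), as.vector(seq.means))

stopCluster(cl)
```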
The parApply function, along with parLapply and parSapply, represents
the most user-friendly way to set up embarrassingly parallel computations. These
functions are the parallel versions of apply, lapply, and sapply and work in
exactly the same way from the user's point of view.
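To make the correspondence concrete, the sketch below runs the same toy computation through lapply, parLapply, sapply and parSapply. It assumes the parallel package distributed with R; the function sum.squares and the vector sizes are hypothetical and only serve as a workload.

```r
library(parallel)

cl = makeCluster(2)

# Hypothetical workload: the sum of squares of the first n integers.
sum.squares = function(n) sum(seq_len(n)^2)
sizes = c(10, 100, 1000)

# Sequential and parallel versions take the same data and function.
seq.list = lapply(sizes, sum.squares)
par.list = parLapply(cl, sizes, sum.squares)
seq.vec = sapply(sizes, sum.squares)
par.vec = parSapply(cl, sizes, sum.squares)

identical(seq.list, par.list)
identical(seq.vec, par.vec)

stopCluster(cl)
```

The only difference in the calls is the cluster object passed as the first argument.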
Problems which are not embarrassingly parallel, or which cannot be divided in
identical parts, can be tackled using a combination of clusterExport (to copy
the data to the slave R processes) and clusterEvalQ (to make the slave processes
execute arbitrary R commands). For instance, we may be interested in comparing
Pearson's and Spearman's correlation matrices for the marks data, and we may
want to estimate these matrices in parallel. To achieve that, we can first export the
marks data to the slave processes,
> clusterExport(cl, list("marks"))
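One possible way of carrying out the whole comparison is sketched here, under several assumptions: it uses the parallel package distributed with R, a simulated data frame named scores in place of the marks data, and clusterApply (not discussed above) to hand a different correlation method to each slave. It is an illustration of the export-then-evaluate pattern, not necessarily the approach followed in the rest of the chapter.

```r
library(parallel)

cl = makeCluster(2)

# Hypothetical stand-in for the marks data.
set.seed(42)
scores = as.data.frame(matrix(rnorm(88 * 5), ncol = 5))

# Copy the data to each slave process.
clusterExport(cl, "scores", envir = environment())

# Check that every slave now holds its own copy of the data.
clusterEvalQ(cl, dim(scores))

# Each slave receives one method and computes the matching correlation matrix.
methods = c("pearson", "spearman")
cors = clusterApply(cl, methods, function(m) cor(scores, method = m))
names(cors) = methods

# Largest absolute difference between the two matrices.
max(abs(cors$pearson - cors$spearman))

stopCluster(cl)
```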