to take advantage of parallel computing. However, it is important to note that the
degree to which an algorithm can leverage parallel processing depends on the nature
of the problem it is trying to address. Some problems are embarrassingly parallel,
that is, they can be split in such a way that each part rarely or never has to
communicate with the other parts. Other problems cannot be fully parallelized, because
their parts have to communicate periodically with each other to synchronize their state.
If frequent synchronizations are required we speak of fine-grained parallelism, and of
coarse-grained parallelism if synchronizations are only needed a few times over a long
period of time. Finally, some problems are inherently sequential and cannot be
parallelized at all.
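As a rough illustration of the two extremes, the sketch below contrasts an embarrassingly parallel task with a sequential recurrence in plain R. The function names simulate_mean and logistic_map, the seeds and the parameter values are hypothetical and chosen only for this example; nothing here is run in parallel, the point is only to show which computations depend on each other.

```r
# Embarrassingly parallel: each task depends only on its own seed,
# so the tasks could be handed to separate processes in any order.
simulate_mean <- function(seed, n = 1000) {
  set.seed(seed)
  mean(rnorm(n))
}
seeds <- 1:8
independent <- vapply(seeds, simulate_mean, numeric(1))

# Running the tasks in reverse order gives the same per-task results,
# which is what makes splitting the work across processes safe.
reversed <- vapply(rev(seeds), simulate_mean, numeric(1))
stopifnot(isTRUE(all.equal(independent, rev(reversed))))

# Sequential: step t needs the value computed at step t - 1,
# so the iterations cannot be evaluated independently of each other.
logistic_map <- function(x0, r, steps) {
  x <- numeric(steps)
  x[1] <- x0
  for (t in 2:steps) {
    x[t] <- r * x[t - 1] * (1 - x[t - 1])
  }
  x
}
trajectory <- logistic_map(x0 = 0.2, r = 3.7, steps = 10)
```

The first computation could be distributed with no communication between its parts, while the loop in logistic_map has to run one step after another.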
5.2 Parallel Programming in R
The R interpreter can only execute one command at a time. The only functions that
can take advantage of multiple processors are the linear algebra routines provided
by the Basic Linear Algebra Subprograms (BLAS) library. To this end, R must be
compiled against a third-party, multi-threaded implementation of the BLAS library
such as the one provided by Intel. However, performance improvements are limited
to algorithms making heavy use of these routines.
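To make this concrete, the following sketch runs a dense matrix product, which R hands over to whatever BLAS library it is linked against, and reports which library that is. The matrix size is an arbitrary assumption, the timing depends entirely on the local installation, and extSoftVersion() only reports the BLAS entry in R 3.4.0 or later.

```r
# Dense matrix products are dispatched to the BLAS library R was built
# or linked against; a multi-threaded BLAS can spread them over cores.
set.seed(1)
n <- 500
a <- matrix(rnorm(n * n), nrow = n)
b <- matrix(rnorm(n * n), nrow = n)

elapsed <- system.time(ab <- a %*% b)["elapsed"]
print(elapsed)

# Report which BLAS shared library is in use (R >= 3.4.0; the entry
# may be an empty string on some platforms).
if (getRversion() >= "3.4.0") {
  print(extSoftVersion()["BLAS"])
}

# An explicit R-level loop over columns computes the same product
# through many smaller BLAS calls issued one at a time by the interpreter.
ab_loop <- matrix(0, n, n)
for (j in seq_len(n)) {
  ab_loop[, j] <- a %*% b[, j]
}
stopifnot(isTRUE(all.equal(ab, ab_loop)))
```

Code that spends most of its time outside such routines, for example in interpreted loops, would not be expected to speed up when the BLAS library is swapped.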
This situation has led to the development of several contributed packages dealing
with parallel computing; an overview of these efforts is provided in Schmidberger
et al. (2009). bnlearn is designed to work with:
• The snow package (Tierney et al., 2008),¹ which provides support for simple
parallel computing using the master-slave model. snow spawns a configurable
number of R processes in the background (the slave processes). The user can then
copy data back and forth and send commands to them from the R console in use
(the master process). The communication between those processes is managed using
either standard TCP sockets or the mechanisms provided by the Rmpi and rpvm
packages. These processes are said to form a cluster and can run on different
computers.
• The Rmpi package (Yu, 2010), which is an R interface to the C libraries
implementing the de facto Message-Passing Interface (MPI) standard, a
language-independent communications protocol designed to program parallel computers.
• The rpvm package (Li and Rossini, 2010), which is an R interface to the Parallel
Virtual Machine (PVM) software. PVM is designed to allow a network of heterogeneous
Unix and Windows machines to be used as a single distributed parallel processor.
• The rsprng package (Li, 2010), which provides independent random number
generators to the slaves spawned by snow.
¹ Since version 2.14, the R base distribution includes a revised copy of snow in the
parallel package.
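The master-slave workflow described for snow can be sketched with the parallel package mentioned in the footnote, which is used here instead of snow only so that the example runs on a stock R installation. The cluster size, the object name offset and the seed are assumptions made for illustration. clusterSetRNGStream() is the random number facility of parallel and stands in for the role played by rsprng; it is not the same generator.

```r
library(parallel)  # ships with R >= 2.14.0

# Spawn two background R processes on the local machine, communicating
# over TCP sockets (the cluster size is an arbitrary choice here).
cl <- makeCluster(2, type = "PSOCK")

# Copy an object from the master process to every worker process.
offset <- 10
clusterExport(cl, "offset", envir = environment())

# Send a command to each worker and collect the answers as a list.
worker_values <- clusterEvalQ(cl, offset + 1)

# Split independent tasks across the workers.
squares <- parLapply(cl, 1:6, function(i) i^2)

# Give each worker its own reproducible random number stream.
clusterSetRNGStream(cl, iseed = 42)
draws <- clusterEvalQ(cl, runif(1))

# Shut the background processes down when the work is done.
stopCluster(cl)
```

A cluster object such as cl is what bnlearn expects to receive, typically through the cluster argument of its learning functions; the exact set of functions accepting it should be checked against the documentation of the installed version.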