Fig. 5.1 Parallel implementation of the Grow-Shrink algorithm in bnlearn
3. Given the Markov blankets and the neighborhoods, the v-structures centered on
a particular node (i.e., the one with the converging arcs) can again be identified
in parallel. As in the previous step, the consistency of the neighborhoods must be
checked and any departure from symmetry must be fixed beforehand.
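The symmetry check mentioned above can be sketched as follows. This is an illustrative fragment, not bnlearn's internal code: it applies the AND rule, keeping a node in the neighborhood of another only when the relationship is reciprocated.

```r
## Illustrative sketch (not bnlearn's actual implementation): enforce the
## symmetry of estimated neighborhoods with the AND rule, so that x stays
## in nbr[[y]] only if y is also in nbr[[x]].
fix.symmetry <- function(nbr) {

  for (x in names(nbr))
    nbr[[x]] <- Filter(function(y) x %in% nbr[[y]], nbr[[x]])

  nbr

}

## Toy example: C lists no neighbors, so the asymmetric A-C link is dropped.
nbr <- list(A = c("B", "C"), B = c("A"), C = character(0))
fix.symmetry(nbr)
```

The OR rule (keeping a link when either node lists the other) is the obvious alternative; which correction is appropriate depends on the learning algorithm.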
Furthermore, the final step of the Grow-Shrink algorithm, in which the directions
of compelled arcs are learned, also exhibits fine-grained parallelism. The order
in which arcs are considered in that step depends on the topology of the graph;
undirected arcs whose orientations would result in the greatest number of cycles are
considered first. That number can be computed in parallel for each arc, at the cost
of introducing some overhead.
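Since each arc's cycle count depends only on the current graph, the counts can be computed independently on the slaves. The fragment below is a sketch of this idea using the parallel package; count.cycles is a hypothetical placeholder for the real cycle-counting routine, which is not shown here.

```r
## Sketch of the fine-grained parallelism described above. count.cycles() is
## a hypothetical helper standing in for the routine that counts how many
## cycles orienting one undirected arc would introduce; here it is a stub.
library(parallel)

## arcs: a two-column matrix of undirected arcs (from, to).
arcs <- matrix(c("A", "B", "B", "C"), ncol = 2, byrow = TRUE,
               dimnames = list(NULL, c("from", "to")))

## Placeholder cycle counter, for illustration only.
count.cycles <- function(arc, graph) 0

cl <- makeCluster(2)
clusterExport(cl, "count.cycles")
## Each row (arc) is scored independently on the slaves.
cycles <- parApply(cl, arcs, 1, count.cycles, graph = NULL)
stopCluster(cl)

## Arcs are then considered in decreasing order of their cycle counts.
ord <- order(cycles, decreasing = TRUE)
```

The overhead mentioned above comes from shipping the graph to the slaves and collecting the counts, which can dominate when the per-arc computation is cheap.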
We will now examine the practical implications of parallelizing a constraint-
based learning algorithm. To that end, we will use the hailfinder data set in-
cluded in bnlearn , which is generated from the reference network of the same name.
Hailfinder is a Bayesian network designed by Abramson et al. (1996) to forecast
severe summer hail in northeastern Colorado. The network contains 56 variables,
and the data set comprises 20,000 observations; together they are large enough to
properly highlight the advantages and the limitations of parallel computing.
Consider a simple cluster with two slave processes.
> data(hailfinder)
> cl = makeCluster(2, type = "MPI")
2 slaves are spawned successfully. 0 failed.
> res = gs(hailfinder, cluster = cl)
> unlist(clusterEvalQ(cl, .test.counter))
[1] 2698 3765
> .test.counter
[1] 4
> stopCluster(cl)
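The counters above show that the two slave processes performed 2698 and 3765 conditional independence tests respectively, while the master process executed only 4. A rough way to gauge the resulting speedup is to time the same learning run with and without the cluster. The snippet below is a sketch using a PSOCK cluster (rather than the MPI cluster shown above) for portability; the timings will of course vary from machine to machine.

```r
## Compare serial and parallel runs of the Grow-Shrink algorithm on the
## hailfinder data; a rough benchmark, not a rigorous one.
library(parallel)
library(bnlearn)

data(hailfinder)
cl <- makeCluster(2)

serial.time <- system.time(gs(hailfinder))
cluster.time <- system.time(gs(hailfinder, cluster = cl))

stopCluster(cl)
print(rbind(serial = serial.time, cluster = cluster.time))
```

Because the conditional independence tests on hailfinder are relatively cheap, communication overhead can eat into the speedup; the gains grow with the cost of the individual tests.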