Parallel Computing for Bayesian Networks - Bayesian Networks in R: With Applications in Systems Biology

Exercises

5.1. Using the hailfinder data set included in bnlearn and a snow cluster with at least 2 slave processes:

(a) Compute the number of levels and the most common level for each node.

(b) Split the samples among the slaves and identify which nodes have at least one level with fewer than 5 observations in each slave's subsample.

(c) Compute the entropy of each variable in hailfinder, defined as

H(p) = \sum -p \log p,

where p is the relative frequency of each level of the variable.
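
As a small illustration of this definition, consider a hypothetical variable with three levels whose relative frequencies are 0.5, 0.25 and 0.25 (these values are invented for the example and do not come from hailfinder). Assuming natural logarithms, since the base is not stated here, the entropy would work out as:

```latex
\begin{align*}
H(p) &= -\left(0.5 \log 0.5 + 0.25 \log 0.25 + 0.25 \log 0.25\right) \\
     &= 0.5 \log 2 + 2 \times 0.25 \log 4 \\
     &= 0.5 \log 2 + \log 2 \\
     &= 1.5 \log 2 \approx 1.04 .
\end{align*}
```

A level that is never observed has p = 0 and is conventionally taken to contribute nothing to the sum, so empty levels can be dropped before the entropy is computed.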

5.2. Consider the alarm data set included in bnlearn.

(a) Learn the structure of the network using Inter-IAMB and a shrinkage test with alpha = 0.01 and measure the execution time of the algorithm.

(b) Does a 2-node cluster provide a greater performance improvement than just switching from optimized = FALSE to optimized = TRUE?

(c) Is that still true when a Monte Carlo permutation test is used?

5.3. Consider again the alarm data set from Exercise 5.2, and a snow cluster with at least 2 nodes.

(a) Use nonparametric bootstrap to determine the distribution of the number of arcs present in a network structure learned with hc.

(b) How does that distribution change when bootstrap samples have size m = 100?

(c) Compare the distribution of the number of score comparisons for m = 100 and m = 5000.

5.4. Implement a parallel version of the model averaging performed using hc with random starting networks in Sect. 2.5.1.
