Solutions - Bayesian Networks in R: With Applications in Systems Biology

+   main = paste("tabu(..., iss = ", iss, ")",
+     sep = "")
+   sub = paste(narcs(bn), "arcs")
+   graphviz.plot(bn, main = main, sub = sub)
+ }

(b) The length of the tabu list has a significant impact on structure learning,
for two reasons. First, a longer list increases the number of network structures
that are visited by tabu, so structure learning requires more time. This is
especially relevant for score functions that are expensive to compute, such as
BGe. Second, the score of the learned network structure consistently increases
with the length of the tabu list: getting stuck in a local maximum becomes less
and less likely as the tabu list grows.

> par(mfrow = c(1, 5))
> for (n in c(10, 15, 20, 50, 100)) {
+   bn = tabu(alarm, score = "bde", tabu = n)
+   bde = score(bn, alarm, type = "bde")
+   main = paste("tabu(..., tabu = ", n, ")",
+     sep = "")
+   sub = paste(ntests(bn), "steps, score", bde)
+   graphviz.plot(bn, main = main, sub = sub)
+ }
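
The two effects described in (b) are easier to compare in a table than across five plots. The following is a minimal sketch, assuming the bnlearn package is installed and using the alarm data set that ships with it. The `results` data frame and its column names are illustrative choices, not part of the original solution, and the elapsed times depend on the machine.

```r
library(bnlearn)
data(alarm)

# Tabu list lengths taken from the loop above.
lengths = c(10, 15, 20, 50, 100)

# One row per tabu list length: arcs, score comparisons, BDe score, run time.
# (The data frame layout is an assumption made for illustration.)
results = do.call(rbind, lapply(lengths, function(n) {
  # "<-" is needed here: "=" would be parsed as a named argument.
  elapsed = system.time(bn <- tabu(alarm, score = "bde", tabu = n))["elapsed"]
  data.frame(tabu.length = n,
             arcs = narcs(bn),
             comparisons = ntests(bn),
             bde = score(bn, alarm, type = "bde"),
             seconds = unname(elapsed))
}))

print(results)
```

Reading down the `comparisons` and `seconds` columns shows the cost of a longer tabu list, while the `bde` column shows how the score of the learned network changes with it.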

(c) The BIC score is asymptotically equivalent to BDe, so the networks learned
using these two scores become more similar as the sample size increases. At small
sample sizes, BIC penalizes dense networks more heavily than BDe, and therefore
results in far fewer arcs being included and in a much lower execution time.

> par(mfrow = c(2, 6))
> for (n in c(100, 200, 500, 1000, 2000, 5000)) {
+   bn.bde = hc(alarm[1:n, ], score = "bde")
+   bn.bic = hc(alarm[1:n, ], score = "bic")
+   bde = score(bn.bde, alarm, type = "bde")
+   bic = score(bn.bic, alarm, type = "bic")
+   main = paste("BDe, sample size", n)
+   sub = paste(ntests(bn.bde), "steps, score", bde)
+   graphviz.plot(bn.bde, main = main, sub = sub)
+   main = paste("BIC, sample size", n)
+   sub = paste(ntests(bn.bic), "steps, score", bic)
+   graphviz.plot(bn.bic, main = main, sub = sub)
+ }
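
The claim in (c) that the two networks grow more similar can also be checked numerically instead of by eye. The sketch below is one possible way to do so, assuming bnlearn is installed. It counts the arcs learned with each score and measures the structural Hamming distance between the two networks with `shd()`. The `comparison` data frame and its column names are illustrative and not part of the original solution.

```r
library(bnlearn)
data(alarm)

# Sample sizes taken from the loop above.
sizes = c(100, 200, 500, 1000, 2000, 5000)

# One row per sample size: arcs under each score and the distance between them.
# (The data frame layout is an assumption made for illustration.)
comparison = do.call(rbind, lapply(sizes, function(n) {
  d = alarm[1:n, ]
  bn.bde = hc(d, score = "bde")
  bn.bic = hc(d, score = "bic")
  data.frame(sample.size = n,
             arcs.bde = narcs(bn.bde),
             arcs.bic = narcs(bn.bic),
             # structural Hamming distance between the two learned networks
             shd = shd(bn.bic, bn.bde))
}))

print(comparison)
```

Comparing `arcs.bic` with `arcs.bde` at the smallest sample sizes shows how strongly each score penalizes dense networks, and the `shd` column gives a single number for how far apart the two structures are at each sample size.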

2.6 Consider the observational data set from Sachs et al. (2005) used in
Sect. 2.5.1 (the original data set, not the discretized one).

(a) Evaluate the networks learned by hill-climbing with BIC and BGe using
cross-validation and the log-likelihood loss function.
