2.2. Using the network structures created in Exercise 2.1 for the asia data set,
produce the following plots with graphviz.plot:
(a) A plot of the CPDAG of the equivalence class in which the arcs belonging to a
v-structure are highlighted (either with a different color or using a thicker line
width).
(b) Fill the nodes with different colors according to their role in the diagnostic process:
causes (“visit to Asia” and “smoking”), effects (“tuberculosis,” “lung cancer,”
and “bronchitis”) and the diagnosis proper (“chest X-ray,” “dyspnea,” and
“either tuberculosis or lung cancer/bronchitis”).
(c) Explore different layouts by changing the layout and shape arguments.
2.3. Consider the marks data set analyzed in Sect. 2.3.
(a) Discretize the data using a quantile transform and different numbers of intervals
(say, from 2 to 5). How does the network structure learned from the resulting
data sets change as the number of intervals increases?
(b) Repeat the analysis with interval discretization, again using up to five intervals,
and compare the resulting networks with the ones obtained previously with
quantile discretization.
(c) Does Hartemink's discretization algorithm perform better than either quantile or
interval discretization? How does its behavior depend on the number of initial
breaks?
2.4. The ALARM network (Beinlich et al., 1989) is a Bayesian network designed
to provide an alarm message system for patients hospitalized in intensive care units
(ICU). Since ALARM is commonly used as a benchmark in the literature, a synthetic
data set of 5,000 observations generated from this network is available from bnlearn
as alarm.
(a) Create a bn object for the “true” structure of the network using the model string
provided in its manual page.
(b) Compare the networks learned with different constraint-based algorithms with
the true one, both in terms of structural differences and using either BIC or BDe.
(c) The overall performance of constraint-based algorithms suggests that the asymptotic
χ² conditional independence tests may not be appropriate for analyzing
alarm. Are permutation or shrinkage tests better choices?
(d) How are the above learning strategies affected by changes to alpha?
2.5. Consider again the alarm network used in Exercise 2.4.
(a) Learn its structure with hill-climbing and tabu search, using the posterior density
BDe as a score function. How does the network structure change with the
imaginary sample size iss?
(b) Does the length of the tabu list have a significant impact on the network structures
learned with tabu?
(c) How does the BIC score compare with BDe at different sample sizes in terms of
structure and score of the learned network?