2.2. Using the network structures created in Exercise 2.1 for the asia data set,
produce the following plots with graphviz.plot:
(a) A plot of the CPDAG of the equivalence class in which the arcs belonging to a
v-structure are highlighted (either with a different color or using a thicker line
width).
(b) Fill the nodes with different colors according to their role in the diagnostic
process: causes (“visit to Asia” and “smoking”), effects (“tuberculosis,” “lung
cancer,” and “bronchitis”) and the diagnosis proper (“chest X-ray,” “dyspnea,” and
“either tuberculosis or lung cancer/bronchitis”).
(c) Explore different layouts by changing the layout and shape arguments.
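As a pointer for part (a), a v-structure is a pair of arcs converging on a node whose two parents are not themselves adjacent. The sketch below is a plain Python illustration of that definition, not bnlearn code. The arc list is an assumption: it encodes the commonly cited structure of the asia network, which may differ from the networks obtained in Exercise 2.1, and the function name v_structures is hypothetical.

```python
from itertools import combinations

# Assumed structure of the asia network (tail, head); the networks learned
# in Exercise 2.1 may differ from this one.
ASIA_ARCS = [
    ("A", "T"), ("S", "L"), ("S", "B"), ("T", "E"),
    ("L", "E"), ("E", "X"), ("E", "D"), ("B", "D"),
]


def v_structures(arcs):
    """Return (parent, child, parent) triples whose two parents are not adjacent."""
    arc_set = set(arcs)
    parents = {}
    for tail, head in arcs:
        parents.setdefault(head, set()).add(tail)

    found = []
    for child, pars in sorted(parents.items()):
        for a, b in combinations(sorted(pars), 2):
            # The pair forms a v-structure only if no arc links the parents.
            if (a, b) not in arc_set and (b, a) not in arc_set:
                found.append((a, child, b))
    return found


asia_vstructs = v_structures(ASIA_ARCS)
```

Under the assumed structure, the arcs to highlight are the ones pointing into D (from B and E) and into E (from L and T).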
2.3. Consider the marks data set analyzed in Sect. 2.3.
(a) Discretize the data using a quantile transform and different numbers of intervals
(say, from 2 to 5). How does the network structure learned from the resulting
data sets change as the number of intervals increases?
(b) Repeat the exercise using interval discretization with up to five intervals,
and compare the resulting networks with the ones obtained previously with
quantile discretization.
(c) Does Hartemink's discretization algorithm perform better than either quantile or
interval discretization? How does its behavior depend on the number of initial
breaks?
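To see why parts (a) and (b) can lead to different networks, it helps to compare where the two methods place their cut points. The following is a minimal sketch in plain Python, not the discretization routine used by bnlearn. The function names and the scores are hypothetical, and the quantile rule shown is only one of several conventions.

```python
from bisect import bisect_right


def interval_breaks(values, k):
    """Interior cut points splitting [min, max] into k equal-width bins."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / k
    return [lo + i * width for i in range(1, k)]


def quantile_breaks(values, k):
    """Interior cut points placing roughly len(values) / k points in each bin."""
    ordered = sorted(values)
    n = len(ordered)
    return [ordered[(i * n) // k] for i in range(1, k)]


def discretize(values, breaks):
    """Map each value to the index of the bin it falls into."""
    return [bisect_right(breaks, v) for v in values]


# Hypothetical skewed scores, not taken from the marks data set.
scores = [1, 2, 3, 4, 5, 6, 7, 8, 9, 100]

interval_bins = discretize(scores, interval_breaks(scores, 2))
quantile_bins = discretize(scores, quantile_breaks(scores, 2))
```

With two bins, the interval rule isolates the single outlier in a bin of its own, while the quantile rule splits the observations evenly. Sparse bins of the first kind leave little data for the conditional independence tests or scores computed afterwards.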
2.4. The ALARM network (Beinlich et al., 1989) is a Bayesian network designed
to provide an alarm message system for patients hospitalized in intensive care units
(ICU). Since ALARM is commonly used as a benchmark in the literature, a synthetic
data set of 5,000 observations generated from this network is available from
bnlearn as alarm.
(a) Create a bn object for the “true” structure of the network using the model string
provided in its manual page.
(b) Compare the networks learned with different constraint-based algorithms with
the true one, both in terms of structural differences and using either BIC or BDe.
(c) The overall performance of constraint-based algorithms suggests that the
asymptotic χ² conditional independence tests may not be appropriate for analyzing
alarm. Are permutation or shrinkage tests better choices?
(d) How are the above learning strategies affected by changes to alpha?
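For the structural comparison in part (b), one simple summary counts the arcs that the learned and true networks share and the arcs found in only one of them. The sketch below is a plain Python illustration of that idea, not bnlearn's implementation. The function name compare_arcs and the three-node networks are hypothetical.

```python
def compare_arcs(true_arcs, learned_arcs):
    """Count arcs shared by both networks and arcs present in only one of them."""
    true_set, learned_set = set(true_arcs), set(learned_arcs)
    return {
        "tp": len(true_set & learned_set),   # arcs in both, same direction
        "fp": len(learned_set - true_set),   # arcs only in the learned network
        "fn": len(true_set - learned_set),   # arcs only in the true network
    }


# Hypothetical three-node networks, not the ALARM structure.
true_arcs = [("A", "B"), ("B", "C")]
learned_arcs = [("A", "B"), ("C", "B"), ("A", "C")]

counts = compare_arcs(true_arcs, learned_arcs)
```

In this count a reversed arc is penalized twice, once as a false positive and once as a false negative. Since constraint-based algorithms cannot always orient every arc, comparing equivalence classes rather than individual DAGs may give a fairer picture.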
2.5. Consider again the alarm network used in Exercise 2.4.
(a) Learn its structure with hill-climbing and tabu search, using the posterior
density BDe as a score function. How does the network structure change with the
imaginary sample size iss?
(b) Does the length of the tabu list have a significant impact on the network
structures learned with tabu?
(c) How does the BIC score compare with BDe at different sample sizes in terms of
structure and score of the learned network?