2.3.4 Structure Learning
So far, we have analyzed the marks data set using pre-specified network structures.
While this approach may be feasible in some settings, such as when expert
knowledge is available, it is far more common for the network structure to be learned
from the data. For this reason, we will now focus on the various options available in
R for structure learning.
Consider, for instance, the network structure learned for the marks data with the
Grow-Shrink implementation from bnlearn (Fig. 2.4).
> bn.gs = gs(marks)
> bn.gs

  Bayesian network learned via Constraint-based methods

  model:
    [STAT][ANL|STAT][ALG|ANL:STAT][VECT|ALG][MECH|VECT:ALG]
  nodes:                                 5
  arcs:                                  6
    undirected arcs:                     0
    directed arcs:                       6
  average markov blanket size:           2.40
  average neighbourhood size:            2.40
  average branching factor:              1.20

  learning algorithm:                    Grow-Shrink
  conditional independence test:         Pearson's Linear Correlation
  alpha threshold:                       0.05
  tests used in the learning procedure:  32
  optimized:                             TRUE
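The session above assumes that the marks data frame is already loaded. As a minimal sketch of a self-contained session, the commands below load bnlearn (which bundles the marks data set) and display the learned structure; the plot() call is one possible way to produce a figure such as Fig. 2.4.

> # load the package and the marks data set bundled with it.
> library(bnlearn)
> data(marks)
> # learn the structure with Grow-Shrink and display it.
> bn.gs = gs(marks)
> plot(bn.gs)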
The default conditional independence test is the Student's t test introduced in
Eq. 2.13, because of its exact distribution, with a threshold of α = 0.05 for the
type I error. Note that constraint-based algorithms are largely self-correcting for
multiplicity (Aliferis et al., 2010a,b); no explicit multiplicity correction such as
family-wise error rate (FWER) or false discovery rate (FDR) is needed to choose
a suitable threshold α. Small values of α, e.g. α ∈ [0.01, 0.05], work well for
networks with up to hundreds of variables.
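The threshold is exposed through the alpha argument of gs; as an illustration, a stricter value such as 0.01 can be tried as follows (the object name bn.gs2 is chosen here for illustration, not taken from the original analysis).

> # repeat structure learning with a stricter type I error threshold.
> bn.gs2 = gs(marks, alpha = 0.01)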
All the IAMB algorithms return the same network structure as gs, which in turn
is identical to the DAG in Fig. 2.2. Even changing the conditional independence
test to Fisher's Z or performing a permutation test (by setting the test argument
to zf or mc-cor, respectively) does not make any difference.
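As a sketch, this equivalence can be checked directly with all.equal(), which compares two bn objects arc by arc; the object names below are chosen for illustration. Given the result just stated, each comparison should return TRUE.

> # learn the structure with IAMB and with alternative tests.
> bn.iamb = iamb(marks)
> bn.zf = gs(marks, test = "zf")
> bn.mc = gs(marks, test = "mc-cor")
> # compare against the Grow-Shrink result.
> all.equal(bn.gs, bn.iamb)
> all.equal(bn.gs, bn.zf)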