The EM algorithm assigned the first 52 students (with the exception of number
45) to group A and the remainder to group B. If we consider once more
the discretized marks dmarks we created in Sect. 2.2.6 and include the latent
variable when learning the structure of the network, we obtain a network structure
completely different from the ones in Fig. 2.6.
> bn.LAT = hc(cbind(dmarks, LAT = latent))
> bn.LAT
  Bayesian network learned via Score-based methods

  model:
   [MECH][ANL][LAT|MECH:ANL][VECT|LAT][ALG|LAT][STAT|LAT]
  nodes:                                 6
  arcs:                                  5
    undirected arcs:                     0
    directed arcs:                       5
  average markov blanket size:           2.00
  average neighbourhood size:            1.67
  average branching factor:              0.83

  learning algorithm:                    Hill-Climbing
  score:                                 Bayesian Information Criterion
  penalization coefficient:              2.238668
  tests used in the learning procedure:  40
  optimized:                             TRUE
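The summary statistics printed above can be checked by hand from the model string. The following is a minimal base-R sketch, not part of the original session: the arc list is transcribed manually from the model string, the helper names (parents, children, markov_blanket) are hypothetical, and the sample size of 88 is an assumption about the marks data that is consistent with the reported penalization coefficient of log(n)/2. The branching factor is taken here to be the average number of children per node, which reproduces the printed value.

```r
# Arcs transcribed by hand from the model string
# [MECH][ANL][LAT|MECH:ANL][VECT|LAT][ALG|LAT][STAT|LAT]
nodes <- c("MECH", "ANL", "LAT", "VECT", "ALG", "STAT")
arcs <- data.frame(
  from = c("MECH", "ANL", "LAT", "LAT", "LAT"),
  to   = c("LAT", "LAT", "VECT", "ALG", "STAT"),
  stringsAsFactors = FALSE
)

# Hypothetical helpers: parents and children of a node in the arc list.
parents  <- function(v) arcs$from[arcs$to == v]
children <- function(v) arcs$to[arcs$from == v]

# Markov blanket: parents, children, and the other parents of the children.
markov_blanket <- function(v) {
  spouses <- unlist(lapply(children(v), parents))
  setdiff(unique(c(parents(v), children(v), spouses)), v)
}

mb_size  <- sapply(nodes, function(v) length(markov_blanket(v)))
nbr_size <- sapply(nodes, function(v) length(c(parents(v), children(v))))
n_child  <- sapply(nodes, function(v) length(children(v)))

avg_mb     <- mean(mb_size)   # average Markov blanket size
avg_nbr    <- mean(nbr_size)  # average neighbourhood size
avg_branch <- mean(n_child)   # average number of children per node

# BIC penalty per parameter, assuming n = 88 observations.
bic_penalty <- log(88) / 2

print(round(c(avg_mb, avg_nbr, avg_branch), 2))
print(bic_penalty)
```

Because LAT is adjacent to every other node, it alone accounts for half of the neighbourhood count, which is why the averages stay small even though LAT's own Markov blanket contains all five observed variables.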
The three network structures learned above are shown in Fig. 2.7; the one
including the latent variable, bn.LAT, agrees with the network structure reported in
Edwards (2000). We can clearly see that any causal relationship we could infer
without taking LAT into account would be potentially spurious. In fact, we could even
question the assumption that the data are a random sample from a single population
and have not been manipulated in some way beforehand.
2.5 Applications to Gene Expression Profiles
Static Bayesian networks provide a versatile tool for the analysis of many kinds
of biological data, including (but not limited to) single-nucleotide polymorphism
(SNP) data and gene expression profiles. Following the work of Friedman et al.
(2000), the expression level or the allele frequency of each gene is associated with
one node. In addition, we can include in the network additional nodes denoting other