a class of structures, V-shaped structures (e.g., Mv: $X_1 \to X_3 \leftarrow X_2$),
that have no Markov equivalent. In such cases it is possible to infer causal
relationships based on correlation data alone. Because there are more
parameters to estimate in the Mv model than in the M1, M2, or M3 models,
there is a large penalty in the BIC score for the Mv model. Therefore, in
practice, a large sample size is needed to differentiate the Mv model from
the M1, M2, or M3 models.

Integrating Genetic Data as a Structure Prior to Enhance Causal Inference in the Bayesian Network Reconstruction Process

In general, Bayesian networks can only be solved to Markov equivalent
structures, so that it is often not possible to determine the causal
direction of a link between two nodes even though Bayesian networks are
directed graphs. However, the Bayesian network reconstruction algorithm can
take advantage of genetic data to break the symmetry among nodes in the
network that leads to Markov equivalent structures, thereby providing a way
to infer causal directions in the network in an unambiguous fashion [24].
The reconstruction algorithm can be modified to incorporate genetic data as
prior evidence that two quantitative traits may be causally related, based
on the previously described causality test [24]. The genetic priors can be
constructed from three basic sources. First, gene expression traits
associated with DNA variants that are coincident with the gene's physical
location (referred to as cis-acting expression quantitative trait loci, or
cis eQTL) [43] are allowed to be parent nodes of genes with coincident
trans eQTLs (the gene in this case does not physically reside at the
genetic locus of interest), $p(\text{cis} \to \text{trans}) = 1$, but genes
with trans eQTLs are not allowed to be parents of genes with cis eQTLs,
$p(\text{trans} \to \text{cis}) = 0$. Second, after identifying all
associations between different genetic loci and expression traits at some
reasonable significance threshold, genes from this analysis with cis or
trans eQTL can be tested individually for pleiotropic effects at each of
their eQTL to determine whether any other genes in the set are driven by
common eQTL [44,45]. If such pleiotropic effects are detected, the
corresponding gene pair and the locus giving rise to the pleiotropic effect
can then be used to infer a causal/reactive or independent relationship
based on the causality test described above. If an independent relationship
is inferred, then the prior probability that gene A is a parent of gene B
can be scaled as

$$p(A \to B) = 1 - \frac{\sum_i p(A \perp B \mid A, B, l_i)}{\sum_i 1},$$

where the sums are taken over all loci $l_i$ used to infer the relationship.
If a causal or reactive relationship is inferred, then the prior probability
is scaled as

$$p(A \to B) = \frac{2 \sum_i p(A \to B \mid A, B, l_i)}{\sum_i \left[ p(A \to B \mid A, B, l_i) + p(B \to A \mid A, B, l_i) \right]}.$$

Finally, if the causal/reactive relationship between genes A and B cannot
be determined from the first two sources, the complexity of the eQTL
signature for each gene can be taken into consideration. Genes with a
simpler, albeit stronger, eQTL signature (i.e., a small number of eQTL that
explain the genetic variance component for the gene, with a significant
proportion of the overall variance explained by the genetic effects) can be
considered more likely to be causal than genes with more complex and
possibly weaker eQTL signatures (i.e., a larger number of eQTL explaining
the genetic variance component for the gene, with less of the overall
variance explained by the genetic effects). The structure prior that gene A
is a parent of gene B can then be taken to be

$$p(A \to B) = \frac{2\,(1 + n(B))}{2 + n(A) + n(B)},$$

where n(A) and n(B) are the number of eQTLs at some predetermined
significance level for genes A and B, respectively.

Incorporating Other 'Omics' Data as Network Priors in the Bayesian Network Reconstruction Process

Just as genetic data can be incorporated as a network prior in the Bayesian
network reconstruction algorithm, so can other types of data, such as
transcription factor-binding site (TFBS) data, protein–protein interaction
(PPI) data, and protein–small molecule interaction data. PPI data can be
used to infer protein complexes to enhance the set of manually curated
protein complexes [46]. PPI-inferred protein complexes can be combined with
manually curated sets, and each protein complex can then be examined for
common transcription factor-binding sites at the corresponding genes. If
some proportion of the genes in a protein complex (e.g., half) carry a given
TFBS, then all genes in the complex can be included in the TFBS gene set as
being under the control of the corresponding transcription factor. Given
that the scale-free property is a general property of biological networks
(i.e., most nodes in the network are linked to a small number of nodes,
whereas a smaller number of nodes are linked to many nodes) [47], inferred
and experimentally determined TFBS data can be incorporated into the network
reconstruction process by