seed numbers (1000 random seed numbers are generated by
a master process, then each slave process starts an MCMC
process using one of the generated seed numbers). Once the
1000 network structures have been generated, common
features are extracted to derive a consensus network. With
this construction, the consensus network may contain
loops, which are prohibited in Bayesian networks. Therefore,
to ensure that the consensus network structure is a directed
acyclic graph, an edge in the original consensus network
is removed if and only if (1) the edge is involved in
a loop, and (2) the edge is the most weakly supported of
all edges making up that loop. The network resulting from
this process is depicted in Figure 26.4b.
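The consensus and loop-removal procedure described above can be sketched in Python as follows. This is a minimal illustration, not the authors' software. The function names, the use of edge frequency as the measure of support, and the 0.3 inclusion threshold are assumptions made for the example; the text states only that common features are extracted and that the most weakly supported edge of each loop is removed.

```python
from collections import Counter


def consensus_support(structures, threshold=0.3):
    """Fraction of sampled structures containing each directed edge.

    `structures` is a list of edge lists, one per MCMC run.
    `threshold` is a hypothetical cutoff, not a value given in the text.
    """
    n = len(structures)
    counts = Counter(edge for edges in structures for edge in set(edges))
    return {edge: c / n for edge, c in counts.items() if c / n > threshold}


def find_loop(edges):
    """Return the edges of one directed loop, or None if the graph is acyclic."""
    graph = {}
    for u, v in edges:
        graph.setdefault(u, []).append(v)
        graph.setdefault(v, [])
    state = {}  # 1 = on the current DFS path, 2 = fully explored
    path = []

    def dfs(u):
        state[u] = 1
        path.append(u)
        for v in graph[u]:
            if state.get(v) == 1:  # back edge closes a loop
                nodes = path[path.index(v):] + [v]
                return list(zip(nodes, nodes[1:]))
            if v not in state:
                found = dfs(v)
                if found:
                    return found
        path.pop()
        state[u] = 2
        return None

    for node in graph:
        if node not in state:
            loop = dfs(node)
            if loop:
                return loop
    return None


def break_loops(support):
    """Repeatedly drop the most weakly supported edge of any remaining loop."""
    support = dict(support)
    while True:
        loop = find_loop(support)
        if loop is None:
            return support
        weakest = min(loop, key=lambda edge: support[edge])
        del support[weakest]


# Toy input: four sampled structures over three hypothetical nodes.
structures = [
    [("A", "B"), ("B", "C"), ("C", "A")],
    [("A", "B"), ("B", "C"), ("C", "A")],
    [("A", "B"), ("B", "C")],
    [("A", "B")],
]
support = consensus_support(structures)
dag = break_loops(support)
```

In the toy input the edge C to A closes a loop and has the lowest support, so it is the one removed. Ties between equally weak edges are broken arbitrarily in this sketch.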
As in Step 2, the Step 5 reconstruction takes trait data as
input. In addition to trait data, priors derived from other
data types are also input into the standard Bayesian network
reconstruction process. The trait data for the 18 nodes and the
related priors are input into the network reconstruction process,
and the resulting network is shown in Figure 26.4d. The
root node of the Bayesian network is URA3, which is the
gene with the cis-acting eQTL associated with other traits
in the network.
Step 6: Comparing the Networks Constructed
in Steps 2 and 5
The main difference between the networks depicted in
Figures 26.4b and 26.4d is the head nodes. In general, directed
links in a Bayesian network do not necessarily represent
causal relationships [24]. The network constructed from the
trait data alone reflects relationships that are not supported
by the genetic perturbation data. The genetic relationships are
well captured by the more integrated network described in
Step 5. For example, the link between RIB4 and URA3 is
oriented differently in the two networks.
Step 3: Constructing Priors Using eQTL Data
The network in Step 2 is constructed without considering any
of the genetic data. Because eQTL data represent a systematic
source of perturbation on the expression data, integrating
these data has the potential to better resolve causal
relationships. Toward this end, expression and genotype data in
the BXR cross are compared to detect eQTLs. The red nodes
in Figure 26.4a indicate that nearly all of the nodes have
QTLs linked to a single locus on chromosome V. Expression
traits that associate with a common eQTL are then subjected
to a statistical test to infer causal relationships between the
traits, as described above. Among the nodes tested, URA3
and YEL016C have cis-acting eQTLs linked to the chromosome V
locus. Nodes with cis-acting eQTLs are allowed
to be causal parent nodes to nodes with trans-acting QTLs.
However, nodes with trans-acting QTLs are not allowed to be
causal parent nodes to nodes with cis-acting eQTLs.
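The cis/trans rule of Step 3 can be written as a simple edge constraint. The sketch below is an assumption-laden illustration: the function names and the 0/1 encoding of the prior are hypothetical, and the published method may use graded priors derived from the causality test rather than a hard mask. The cis labels for URA3 and YEL016C and the trans label for RIB4 follow the text.

```python
def edge_allowed(parent, child, qtl_type):
    """Return False only for the direction the text forbids: trans -> cis."""
    return not (qtl_type.get(parent) == "trans" and qtl_type.get(child) == "cis")


def structure_prior_mask(nodes, qtl_type):
    """Map each ordered pair (parent, child) to 1.0 if allowed, else 0.0.

    The 0/1 values are an illustrative encoding, not the authors' prior.
    """
    return {
        (p, c): 1.0 if edge_allowed(p, c, qtl_type) else 0.0
        for p in nodes
        for c in nodes
        if p != c
    }


# QTL types taken from the text: URA3 and YEL016C act in cis, RIB4 in trans.
qtl_type = {"URA3": "cis", "YEL016C": "cis", "RIB4": "trans"}
mask = structure_prior_mask(list(qtl_type), qtl_type)
```

Here `mask[("URA3", "RIB4")]` is 1.0 and `mask[("RIB4", "URA3")]` is 0.0. Edges between two cis nodes are left unconstrained because the text does not address them.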
In the Step 6 comparison, the link RIB4 → URA3 depicted in
Figure 26.4b is opposite to that identified in Figure 26.4d.
Because the genetic perturbation at the URA3 locus affects
the expression activity of that gene in cis and the expression
activity of the gene RIB4 in trans, the experimentally
supported relationship is URA3 → RIB4. Note that the
enzyme-metabolite and metabolite-metabolite relationships
are similar with or without the priors derived from the
KEGG pathways.
All data and software used to construct the Bayesian
networks for this example are available at
http://www.mssm.edu/research/institutes/genomics-institute/rimbanet.
Step 4: Constructing Priors using KEGG Data
The network constructed in Step 2 also does not consider
known relationships among genes and metabolites as
defined by canonical pathways. The relationships between
enzymes and metabolites are well established in many cases.
To incorporate this knowledge into the network reconstruction
process, we construct priors using canonical
pathway data in the following way. There are two metabolites
in the URA3 subnetwork. Their distances from each
other and from related enzymes are defined in the KEGG
database. The structure prior for the gene expression of an
enzyme e affecting a metabolite concentration m is constructed
using their shortest distance $d_{m,e}$ as
$p(m \to e) \propto e^{-\lambda d_{m,e}}$.
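A distance-based pathway prior of the kind described in Step 4 can be sketched as follows, assuming the prior decays exponentially with the shortest pathway distance between a metabolite and an enzyme. The toy pathway graph, the node names, the function names, and the value of the decay parameter `lam` are all hypothetical; real distances would come from the KEGG database, and the weight below is unnormalized.

```python
import math
from collections import deque


def shortest_distance(adjacency, source, target):
    """Breadth-first shortest path length in a pathway graph; None if unreachable."""
    if source == target:
        return 0
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        node, dist = queue.popleft()
        for nxt in adjacency.get(node, ()):
            if nxt == target:
                return dist + 1
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None


def pathway_prior(distance, lam=1.0):
    """Unnormalized prior weight exp(-lam * distance); lam is an assumed value."""
    return math.exp(-lam * distance)


# Hypothetical pathway: enzyme E1 is adjacent to metabolite M1, M1 to M2.
adjacency = {"E1": ["M1"], "M1": ["E1", "M2"], "M2": ["M1"]}
d_near = shortest_distance(adjacency, "E1", "M1")
d_far = shortest_distance(adjacency, "E1", "M2")
w_near = pathway_prior(d_near)
w_far = pathway_prior(d_far)
```

With this form, a metabolite one step from the enzyme receives a larger prior weight than one two steps away, so the closer pair is favored during structure search.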
Networks Constructed from Human and
Animal Data Elucidate the Complexity
of Disease
We have applied the modeling described in detail for the
yeast cross to human and mouse populations segregating
a number of different diseases, such as obesity, diabetes,
and heart disease. For example, in a segregating mouse
population in which an extensive suite of disease traits
associated with metabolic syndrome was manifested,
including obesity, diabetes, and atherosclerosis
[3], we carried out the type of network analysis discussed
above using genetic data typed in all animals and gene
expression data generated from the liver and adipose tissues
of all animals in the population. With this approach we
found that, of the many functional units (subnetworks)
identified in the networks that reflected core biological
processes specific to the liver and adipose tissues, only
a handful were strongly causally associated with the
metabolic syndrome traits. One module in particular stood
out, not only because it was conserved across the liver and
Step 5: Constructing Networks using Expression
Data, Metabolite Data, and the Genetic and Canonical
Pathway Priors defined in Steps 3 and 4
The process of reconstructing networks using trait data and
priors from other data types is similar to the reconstruction
process applied to trait data only, described in Step 2.