Biomedical Engineering Reference
In-Depth Information
participating in the regulation of gene g i . As such, the predictor does not consider the
type of regulation (repression versus activation), and is analogous to the support of
a function in logic synthesis. Each gene has a single predictor (which is a collection
of genes) and the predictor set is the set consisting of predictors of each gene in the
GRN.
There are several observations that impact the formulation of our GRN model and
predictor inference algorithm. First, the activity level (i.e. activation or repression)
of all genes at a particular time t represents the state of the GRN at that time t . From
our knowledge of biological systems, we observe that over time, cellular processes
converge to sequences of stable attractor states. Some of these attractor states repre-
sent normal cellular phenomena in biology (i.e. cell cycle and division), while other
attractor states are consistent with disease (i.e. the metastasis of cancer). Second,
the GRN is often inferred by observing microarray-based experimental data though
which the activity level of genes is measured. The observations of gene activity (or
state) can be used to infer the gene regulation network. The disadvantage of using mi-
croarray data is that such studies do not involve controlled time-series experimental
data. Hence the measurements are assumed to arise from cyclic sequences of gene
expressions (attractor states) in steady state. Such a sequence is referred to as an
attractor cycle . The GRN is then inferred from this data, using methods traditionally
based on probabilistic transition models [ 7 ], [ 8 ].
As previously mentioned, it is necessary to determine the predictor set in order
to reconstruct the GRN. However, there may exist many possible predictors for any
gene, based on the attractor cycle data. Furthermore, only certain combinations of
predictors may form a valid predictor set, due to biological constraints. The issue
addressed in this chapter is how to efficiently and deterministically select the predic-
tors that form the predictor set. We have implemented a Boolean satisfiability (SAT)
based algorithm for the inference of gene predictor sets. Satisfiability is a decision
problem of determining whether the variables in a Boolean formula (expressed in
Conjunctive Normal Form or CNF) can be assigned to make the formula evaluate
to true . Although SAT is NP-complete, many SAT solvers have been developed to
quickly and efficiently solve large SAT problems. Our algorithm takes advantage of
a recent SAT solver to find the predictor set.
The basic outline of our SAT-based algorithm for predictor set inference is de-
scribed briefly below. First, all possible orderings of attractor states are enumerated,
yielding all possible attractor cycles. For each ordering, we enumerate all predictors
that are logically valid, and create a CNF expression which encodes all these pre-
dictors and biological constraints (such as cardinality bounds on the predictors). A
SAT solver is then used to find the valid candidate predictor sets. After this process
is done iteratively for all attractor cycle (orderings), statistical analysis provides the
most likely predictor set. Note that the algorithm in this chapter does not claim to
extract the GRN. Using the predictor set inferred by this chapter, we can infer the
GRN in a subsequent step.
Search WWH ::




Custom Search