Inferring the Topology of Gene Regulatory Networks: An Algebraic Approach to Reverse Engineering - Mathematical Concepts and Methods in Modern Biology

Draw the state space and wiring diagram of this stochastic PDS, labeling the edges with the corresponding probabilities.

Project 3.6. Consider the four time series in Section 4.3 of [30]. Find the ideal of polynomials that vanish on the series and, using the software package Gfan [31], compute its Gröbner fan.

3.6 DISCRETIZATION

For reasons explained at the beginning of Section 3.2, we have been assuming that the experimental data we use for reverse engineering have already been discretized into a (small) finite number of states. Typically, however, experimental measurements come to us as floating-point numbers, so data discretization is in fact part of the modeling process and can be viewed as a preprocessing step. We will use the definition of discretization presented in [35].

Definition 3.11. A discretization of a real-valued vector $v = (v_1, \ldots, v_N)$ is an integer-valued vector $d = (d_1, \ldots, d_N)$ with the following properties:

1. Each element of $d$ is in the set $\{0, \ldots, D - 1\}$ for some (usually small) positive integer $D$, called the degree of the discretization.

2. For all $1 \le i, j \le N$, we have $d_i \le d_j$ if and only if $v_i \le v_j$.

Spanning discretizations of degree $D$ satisfy the additional property that the smallest element of $d$ is equal to 0 and that the largest element of $d$ is equal to $D - 1$.

Without loss of generality, assume that $v$ is sorted, i.e., for all $i < j$, $v_i \le v_j$.
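The two properties of Definition 3.11 can be checked mechanically. The following is a minimal sketch, not code from [35]; the function names `is_discretization` and `is_spanning` and the `strict` flag are hypothetical. Read literally, the "if and only if" in property 2 forbids assigning the same state to two distinct values, so by default the sketch assumes the weaker, order-preserving reading (if $v_i \le v_j$ then $d_i \le d_j$) and checks the literal condition only when `strict=True`. The spanning check assumes the largest admissible state is $D - 1$, consistent with property 1.

```python
from itertools import product
from numbers import Integral


def is_discretization(v, d, degree, strict=False):
    """Check properties 1 and 2 of Definition 3.11 for vectors v and d.

    strict=False (assumption): only require v[i] <= v[j]  =>  d[i] <= d[j].
    strict=True: also require the converse, i.e. the literal "if and only if".
    """
    if len(v) != len(d):
        return False
    # Property 1: every entry of d is an integer in {0, ..., degree - 1}.
    if any(not (isinstance(x, Integral) and 0 <= x < degree) for x in d):
        return False
    # Property 2: the ordering of v is reflected in d.
    for i, j in product(range(len(v)), repeat=2):
        if v[i] <= v[j] and not d[i] <= d[j]:
            return False
        if strict and d[i] <= d[j] and not v[i] <= v[j]:
            return False
    return True


def is_spanning(d, degree):
    """A spanning discretization uses both the lowest and the highest state."""
    return min(d) == 0 and max(d) == degree - 1


v = [0.1, 0.4, 2.5, 3.0]          # made-up measurements
d = [0, 0, 1, 1]                  # candidate discretization of degree 2
print(is_discretization(v, d, degree=2), is_spanning(d, degree=2))
```

With these made-up values, `d` passes the order-preserving check and is spanning, but it fails with `strict=True` because 0.1 and 0.4 share the state 0.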

There is no universal method of data discretization that works for all data sets and all purposes. Sometimes discretization is a straightforward process. For example, if a gene expression time series has a sigmoidal shape, e.g., (…), it is reasonable to discretize it as (…). More complicated expression profiles may be easy to discretize too, and it is often true that the human eye is the best discretization “tool,” whose ability to discern patterns cannot be reproduced by any software. Large data sets, on the other hand, do require some level of automation in the discretization process. Regardless of the particular situation, it is good practice to look at the data first and explore for any patterns that may help with the discretization before inputting the data into any discretization algorithm. Afterwards, the way you choose to discretize your data, which includes selecting the number of discrete states, should depend on the type and amount of data and the specific reason for discretization. Below we present several possible approaches, which by no means comprise a complete list.

Binary discretizations are the simplest way of discretizing data, used, for instance, for the construction of Boolean network models for gene regulatory networks [36, 37]. The expression data are discretized into only two qualitative states as either present or absent. An obvious drawback of binary discretization is that labeling real-valued data according to a present/absent scheme may cause the loss of large amounts of information.
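As an illustration, a present/absent labeling can be sketched with a single threshold. The text does not say how the cutoff is chosen, so the function `binarize`, its `threshold` parameter, and the fallback to the series mean are all assumptions made for this example, and the expression values are made up.

```python
from statistics import mean


def binarize(values, threshold=None):
    """Label each value 1 (present) if it exceeds the threshold, else 0 (absent)."""
    values = list(values)
    if threshold is None:
        # Assumption: with no threshold supplied, use the mean of the series.
        threshold = mean(values)
    return [1 if x > threshold else 0 for x in values]


series = [0.2, 0.3, 1.1, 4.8, 5.0]   # made-up expression levels
states = binarize(series)
print(states)                         # [0, 0, 0, 1, 1]
```

The loss of information is visible even in this small case: 0.2 and 1.1 differ by more than a factor of five, yet both are labeled absent.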
