Assumption 2 is also satisfied by construction provided the error covariance matrix Σ is diagonal (see Lèbre 2009). The assumption of uncorrelated errors between different variables may not hold in real-world scenarios, but it is not unreasonable. Assumption 3 is difficult to verify, but it is not too restrictive if the variables included in the data set are distinct. Then, from Theorem 3.1, a VAR(1) process whose error covariance matrix Σ is diagonal can be represented by a dynamic Bayesian network whose arcs are identified by the nonzero elements of A.
For an illustration, any VAR(1) process with diagonal Σ where the matrix A has the following form (the elements a_ij refer to nonzero coefficients),

$$
A = \begin{pmatrix}
a_{11} & a_{12} & 0 \\
a_{21} & 0      & 0 \\
0      & a_{32} & 0
\end{pmatrix}
\tag{3.14}
$$

can be represented by the dynamic network in Fig. 3.2c. For instance, the nonzero coefficient a_12 implies the arc from X_2 to X_1 in Fig. 3.2a.
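As a concrete illustration (not part of the original example), the short Python sketch below reads off the arcs of the dynamic Bayesian network from the zero pattern of A in Eq. 3.14. The numeric values assigned to the a_ij are arbitrary placeholders; only their positions matter.

```python
# A minimal sketch: map nonzero entries of the autoregressive matrix A
# (Eq. 3.14) to arcs X_j(t-1) -> X_i(t) of the dynamic Bayesian network.
# The coefficient values below are illustrative placeholders.
import numpy as np

A = np.array([
    [0.5, -0.3, 0.0],   # a11, a12, 0
    [0.2,  0.0, 0.0],   # a21, 0,   0
    [0.0,  0.7, 0.0],   # 0,   a32, 0
])

# np.nonzero returns the (row, column) indices of nonzero coefficients;
# row i is the target variable, column j the lagged parent.
arcs = [(f"X{j + 1}(t-1)", f"X{i + 1}(t)")
        for i, j in zip(*np.nonzero(A))]
print(arcs)
# [('X1(t-1)', 'X1(t)'), ('X2(t-1)', 'X1(t)'),
#  ('X1(t-1)', 'X2(t)'), ('X2(t-1)', 'X3(t)')]
```

In particular, the nonzero a_12 yields the arc from X_2(t-1) to X_1(t) mentioned above.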
3.3 Dynamic Bayesian Network Learning Algorithms
Several learning approaches for static Bayesian networks have been covered in Chap. 2. Learning a dynamic Bayesian network that defines a VAR model is a very different process: it amounts to identifying the nonzero coefficients of the autoregressive matrix A from the data. Under the homogeneity assumption (Assumption 4 in Sect. 3.2.1), repeated time measurements can be used to perform linear regression.
Let k be the number of variables under study. Then each variable X_i, i = 1, ..., k, in a VAR(1) process satisfies

$$
X_i(t) = \sum_{j=1}^{k} a_{ij} X_j(t-1) + b_i + \varepsilon_i(t),
\qquad \text{where } \varepsilon_i(t) \sim N(0, \sigma_i(t)).
\tag{3.15}
$$
However, the classic ordinary least squares estimates of the regression coefficients a_ij and b_i can be computed only when n > k, thus ensuring that the sample covariance matrix has full rank. For real-world data, regularized estimators are required in most cases.
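To make the regression in Eq. 3.15 concrete, here is a minimal Python/NumPy sketch (an illustration under simplifying assumptions, not the book's code). It assumes a single homogeneous time series stored as an (n x k) array with rows ordered in time, stacks each observation against the previous one, and estimates A and b by ordinary least squares; as noted above, this requires n to exceed k.

```python
# Minimal OLS sketch for Eq. 3.15: regress X(t) on X(t-1) plus an intercept.
import numpy as np

def fit_var1_ols(X):
    """Return (A_hat, b_hat) for X(t) = A X(t-1) + b + noise."""
    past, present = X[:-1], X[1:]                          # aligned X(t-1), X(t)
    design = np.column_stack([past, np.ones(len(past))])   # add intercept column
    coef, *_ = np.linalg.lstsq(design, present, rcond=None)
    A_hat = coef[:-1].T        # row i holds a_i1, ..., a_ik for variable X_i
    b_hat = coef[-1]           # intercepts b_i
    return A_hat, b_hat

# Toy data only to exercise the mechanics (n = 200 time points, k = 3 variables);
# with pure noise the fitted coefficients carry no substantive meaning.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
A_hat, b_hat = fit_var1_ols(X)
print(A_hat.shape, b_hat.shape)    # (3, 3) (3,)
```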
3.3.1 Least Absolute Shrinkage and Selection Operator
The Least Absolute Shrinkage and Selection Operator, or LASSO (Tibshirani 1996), is a standard procedure, first applied to network inference by Meinshausen and Bühlmann (2006). This constrained estimation procedure tends to produce coefficients that are exactly zero by penalizing the L1 norm of the coefficient vector, that is, the sum of their absolute values. Variable selection is then straightforward: only the nonzero coefficients define arcs in the network.
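As a hedged sketch of this idea (using scikit-learn's Lasso as one possible implementation, not the specific estimator referenced here), the per-variable lagged regression from the previous sketch can be refit with an L1 penalty so that many a_ij are shrunk exactly to zero; the surviving nonzero entries then suggest candidate arcs. The penalty value alpha below is purely illustrative and would in practice be chosen by cross-validation (e.g. with LassoCV).

```python
# LASSO sketch: sparse estimate of A from lagged regressions, one variable at a time.
import numpy as np
from sklearn.linear_model import Lasso

def fit_var1_lasso(X, alpha=0.1):
    """Return a sparse estimate of A; nonzero entries suggest arcs X_j(t-1) -> X_i(t)."""
    past, present = X[:-1], X[1:]
    k = X.shape[1]
    A_hat = np.zeros((k, k))
    for i in range(k):
        # L1-penalized regression of X_i(t) on all X_j(t-1); intercept plays the role of b_i.
        model = Lasso(alpha=alpha).fit(past, present[:, i])
        A_hat[i] = model.coef_
    return A_hat

# Toy data with n only slightly larger than k, the regime where regularization helps.
rng = np.random.default_rng(1)
X = rng.normal(size=(50, 10))
A_hat = fit_var1_lasso(X)
print((A_hat != 0).sum(), "candidate arcs")
```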