causation (IC) algorithm (Verma and Pearl, 1991) provides a framework for learning the structure of Bayesian networks using conditional independence tests.
The details of the IC algorithm are described in Algorithm 2.1. The first step identifies which pairs of variables are connected by an arc, regardless of its direction. These variables cannot be independent given any other subset of variables, because they cannot be d-separated. This step can also be seen as a backward selection procedure starting from the saturated model with a complete graph and pruning it based on statistical tests for conditional independence.
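This backward pruning can be sketched in a few lines of Python. The `indep` function below is a hypothetical conditional independence oracle standing in for the statistical tests; here it answers for the serial connection A → C → B, whose only independence is A ⊥ B | C.

```python
from itertools import chain, combinations

def subsets(iterable):
    """All subsets of a collection, from the empty set to the full set."""
    s = list(iterable)
    return chain.from_iterable(combinations(s, r) for r in range(len(s) + 1))

def learn_skeleton(nodes, indep):
    """Backward selection: start from the complete graph and drop every
    edge (a, b) for which some conditioning set d-separates a and b."""
    edges = {frozenset(p) for p in combinations(nodes, 2)}
    for a, b in combinations(nodes, 2):
        others = [n for n in nodes if n not in (a, b)]
        if any(indep(a, b, set(s)) for s in subsets(others)):
            edges.discard(frozenset((a, b)))
    return edges

# Hypothetical oracle for the chain A -> C -> B:
# A and B are independent only when conditioning on C.
def indep(a, b, cond):
    return {a, b} == {"A", "B"} and "C" in cond

skeleton = learn_skeleton(["A", "B", "C"], indep)
print(sorted(sorted(e) for e in skeleton))  # [['A', 'C'], ['B', 'C']]
```

The spurious A–B edge is removed because conditioning on C d-separates A and B, while both arcs incident to C survive every test; the exhaustive loop over conditioning sets is exactly what makes this step exponential in practice.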
The second step deals with the identification of the v-structures among all the pairs of nonadjacent nodes A and B with a common neighbor C. By definition, v-structures are the only fundamental connection in which the two nonadjacent nodes are not independent conditional on the third one. Therefore, if no subset of nodes containing C d-separates A and B, the three nodes are part of a v-structure centered on C. This condition can be verified by performing a conditional independence test for A and B against every possible subset of their common neighbors that includes C. At the end of the second step, both the skeleton and the v-structures of the network are known, so the equivalence class the Bayesian network belongs to is uniquely identified.
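A minimal sketch of this check, again against a hypothetical d-separation oracle (here answering for the collider A → C ← B, where A and B are dependent exactly when the conditioning set contains C):

```python
from itertools import chain, combinations

def subsets(iterable):
    s = list(iterable)
    return chain.from_iterable(combinations(s, r) for r in range(len(s) + 1))

def find_v_structures(nodes, skeleton, indep):
    """For each nonadjacent pair (a, b) with a common neighbor c, declare a
    v-structure a -> c <- b when no d-separating set of a and b contains c."""
    adj = {n: {m for e in skeleton if n in e for m in e if m != n}
           for n in nodes}
    vs = set()
    for a, b in combinations(nodes, 2):
        if frozenset((a, b)) in skeleton:
            continue  # adjacent pairs cannot form a v-structure
        for c in adj[a] & adj[b]:
            others = [n for n in nodes if n not in (a, b)]
            sets_with_c = [set(s) for s in subsets(others) if c in s]
            if not any(indep(a, b, s) for s in sets_with_c):
                vs.add((a, c, b))
    return vs

# Hypothetical oracle for the collider A -> C <- B: A and B are
# independent unless the conditioning set contains C.
def indep(a, b, cond):
    return {a, b} == {"A", "B"} and "C" not in cond

skeleton = {frozenset(("A", "C")), frozenset(("B", "C"))}
print(find_v_structures(["A", "B", "C"], skeleton, indep))  # {('A', 'C', 'B')}
```

Had the connection been serial or divergent instead, the set {C} would have d-separated A and B and no v-structure would have been reported.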
The third and last step of the IC algorithm identifies compelled arcs and orients them recursively to obtain the completed partially directed acyclic graph (CPDAG) describing the equivalence class identified by the previous steps.
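This recursive orientation is commonly performed with Meek's rules; the sketch below applies only the first rule for brevity, on an assumed skeleton A–C, B–C, C–D with the single v-structure A → C ← B:

```python
def orient_compelled(skeleton, v_structures):
    """Propagate Meek's first rule: if a -> b is directed, b - c is
    undirected, and a and c are nonadjacent, orient b -> c (otherwise
    a new v-structure a -> b <- c would have been detected earlier)."""
    directed = set()
    for a, c, b in v_structures:  # v-structure a -> c <- b
        directed.add((a, c))
        directed.add((b, c))
    undirected = {e for e in skeleton
                  if not any(set(e) == {x, y} for (x, y) in directed)}
    changed = True
    while changed:
        changed = False
        for (a, b) in list(directed):
            for e in list(undirected):
                if b not in e:
                    continue
                c = next(n for n in e if n != b)
                if c != a and frozenset((a, c)) not in skeleton:
                    directed.add((b, c))
                    undirected.discard(e)
                    changed = True
    return directed, undirected

skeleton = {frozenset(e) for e in [("A", "C"), ("B", "C"), ("C", "D")]}
directed, undirected = orient_compelled(skeleton, {("A", "C", "B")})
print(sorted(directed))  # [('A', 'C'), ('B', 'C'), ('C', 'D')]
```

The C–D arc is compelled: leaving it as D → C would create a v-structure at C that the second step did not find, so every member of the equivalence class orients it C → D. Arcs that remain undirected after all rules are applied stay undirected in the CPDAG.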
A major problem of the IC algorithm is that the first two steps cannot be applied in the form described in Algorithm 2.1 to any real-world problem, due to the exponential number of possible conditional independence relationships. This has led to the development of improved algorithms such as the following:
PC: the first practical application of the IC algorithm (Spirtes et al., 2001), a backward selection procedure from the saturated graph
Grow-Shrink (GS): based on the Grow-Shrink Markov blanket algorithm (Margaritis, 2003), a simple forward selection Markov blanket detection approach
Incremental Association (IAMB): based on the Incremental Association Markov blanket algorithm (Tsamardinos et al., 2003), a two-phase selection scheme based on a forward selection followed by a backward one
Fast Incremental Association (Fast-IAMB): a variant of IAMB which uses speculative stepwise forward selection to reduce the number of conditional independence tests (Yaramakala and Margaritis, 2005)
Interleaved Incremental Association (Inter-IAMB): another variant of IAMB which uses forward stepwise selection (Tsamardinos et al., 2003) to avoid false positives in the Markov blanket detection phase
All these algorithms, with the exception of PC, first learn the Markov blanket of each node in the network. This preliminary step greatly simplifies the identification of the neighbors of each node, as the search can be limited to its Markov blanket. As a result, the number of conditional independence tests performed by the learning algorithm and its overall computational complexity are significantly reduced.
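The Grow-Shrink Markov blanket detection underlying GS can be sketched as follows, with `indep` again a hypothetical conditional independence oracle (answering for the chain A → C → B):

```python
def grow_shrink(target, nodes, indep):
    """Sketch of Grow-Shrink Markov blanket detection (Margaritis, 2003):
    a growing phase adds candidate variables, a shrinking phase removes
    the false positives admitted along the way."""
    mb = []
    # Grow: add any variable still dependent on the target given the blanket.
    changed = True
    while changed:
        changed = False
        for x in nodes:
            if x != target and x not in mb and not indep(target, x, set(mb)):
                mb.append(x)
                changed = True
    # Shrink: drop variables independent of the target given the rest.
    for x in list(mb):
        if indep(target, x, set(mb) - {x}):
            mb.remove(x)
    return set(mb)

# Hypothetical oracle for the chain A -> C -> B: the only
# independence is A and B given C.
def indep(a, b, cond):
    return {a, b} == {"A", "B"} and "C" in cond

print(sorted(grow_shrink("C", ["A", "B", "C"], indep)))  # ['A', 'B']
print(sorted(grow_shrink("A", ["A", "B", "C"], indep)))  # ['C']
```

The second call illustrates why the shrinking phase exists: B enters the blanket of A during the growing phase (A and B are marginally dependent) and is only discarded once C is available to condition on. Restricting the neighbor search of each node to sets like these is what makes GS and the IAMB variants tractable.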