may also be applied to deal with continuous data when one or more variables exhibit
severe departures from normality (skewness, heavy tails, etc.).
The intervals the variables will be discretized into can be chosen in one of the
following ways:
• Using prior knowledge on the data. The boundaries of the intervals are defined,
for each variable, to correspond to significantly different real-world scenarios,
such as the concentration of a particular pollutant (absent, dangerous, lethal) or
age classes (child, adult, elderly).
• Using heuristics before learning the structure of the network. Some examples are
the Sturges, Freedman-Diaconis, and Scott rules (Venables and Ripley, 2002); see
the sketch after this list.
• Choosing the number of intervals and their boundaries to balance accuracy and
information loss (Kohavi and Sahami, 1996), again one variable at a time and
before the network structure has been learned. A similar approach considering
pairs of variables is presented in Hartemink (2001).
• Performing learning and discretization iteratively until no further improvement
is made (Friedman and Goldszmidt, 1996).
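To make the second and third strategies concrete, here is a minimal sketch in R. The
nclass.Sturges(), nclass.FD(), and nclass.scott() functions in base R implement the
three heuristics named above, and the discretize() function in bnlearn implements
Hartemink's pairwise approach. The variables x and y, and the numbers of breaks,
are illustrative choices rather than prescriptions.

    # a simulated continuous variable with heavy tails.
    x <- rt(1000, df = 3)

    # number of intervals suggested by each heuristic.
    nclass.Sturges(x)
    nclass.FD(x)
    nclass.scott(x)

    # discretize into equal-width intervals using the Sturges rule.
    x.disc <- cut(x, breaks = nclass.Sturges(x))

    # Hartemink's pairwise approach (bnlearn): start from many
    # quantile-based intervals and collapse them down to 3 levels
    # while preserving the mutual information between variables.
    library(bnlearn)
    data <- data.frame(x = x, y = 0.5 * x + rnorm(1000))
    data.disc <- discretize(data, method = "hartemink", breaks = 3,
                            ibreaks = 20, idisc = "quantile")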
These strategies represent different trade-offs between the accuracy of the discrete
representation of the original data and the computational efficiency of the
transformation.
2.3 Static Bayesian Networks Modeling with R
In this section, we demonstrate structure learning, parameter learning, and manip-
ulation of a static Bayesian network in the R environment. Several of the packages
introduced in Sect. 2.3.1 will be covered to provide an overview of the possibilities
offered by R. All code will be illustrated using a very simple data set and explained
step by step to develop a thorough understanding of Bayesian network learning.
2.3.1 Popular R Packages for Bayesian Network Modeling
There are several packages on CRAN dealing with Bayesian networks. They can
be divided into two categories: those that deal with structure learning and those
that focus only on parameter learning and inference (Table 2.1).
Packages bnlearn (Scutari, 2010, 2012), deal (Bøttcher and Dethlefsen, 2003),
pcalg (Kalisch et al., 2012), and catnet (Balov and Salzman, 2012) fall into the
first category. bnlearn offers a wide variety of structure learning algorithms (span-
ning all three classes covered in this chapter, with the tests and scores covered
in Sect. 2.2.4), parameter learning approaches (maximum likelihood for discrete
and continuous data, Bayesian estimation for discrete data), and inference
techniques.
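As a preview of the workflow demonstrated in the rest of this section, the sketch
below learns a network with bnlearn from the learning.test data set shipped with
the package; the choice of the hill-climbing algorithm here is just one of the many
alternatives the package offers.

    library(bnlearn)

    # a small discrete data set included in bnlearn.
    data(learning.test)

    # score-based structure learning via hill-climbing (BIC by default).
    dag <- hc(learning.test)

    # maximum likelihood estimates of the conditional probability tables.
    fitted.mle <- bn.fit(dag, data = learning.test, method = "mle")

    # Bayesian posterior estimates, with imaginary sample size 10.
    fitted.bayes <- bn.fit(dag, data = learning.test, method = "bayes",
                           iss = 10)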