(1) There is no missing data in sample D, or D is complete;
(2) Parameter vectors are mutually independent, viz.

$$p(\theta_s \mid S^h) = \prod_{i=1}^{n} \prod_{j=1}^{q_i} p(\theta_{ij} \mid S^h).$$

This is called parameter independence.
Under the above assumptions, for a given random sample D the parameters remain independent:

$$p(\theta_s \mid D, S^h) = \prod_{i=1}^{n} \prod_{j=1}^{q_i} p(\theta_{ij} \mid D, S^h) \qquad (6.37)$$
Then we can update each parameter vector $\theta_{ij}$ independently. Suppose each parameter $\theta_{ij}$ has a Dirichlet prior $\mathrm{Dir}(\theta_{ij} \mid \alpha_{ij1}, \alpha_{ij2}, \ldots, \alpha_{ijr_i})$; we then get the posterior distribution:

$$p(\theta_{ij} \mid D, S^h) = \mathrm{Dir}(\theta_{ij} \mid \alpha_{ij1} + N_{ij1}, \alpha_{ij2} + N_{ij2}, \ldots, \alpha_{ijr_i} + N_{ijr_i}) \qquad (6.38)$$

where $N_{ijk}$ is the number of cases in D that satisfy $X_i = x_i^k$ and $Pa_i = pa_i^j$. Now we can make the prediction of interest by taking the mean over the possible $\theta_s$.
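As a concrete illustration, the posterior update of Eq. (6.38) amounts to adding the observed counts $N_{ijk}$ to the prior hyperparameters. A minimal sketch in Python, for one variable $X_i$ under one parent configuration $pa_i^j$ (the function name and the toy numbers are our own, not from the text):

```python
import numpy as np

def dirichlet_posterior(alpha, counts):
    """Posterior hyperparameters of a Dirichlet prior after observing
    multinomial counts: alpha_ijk + N_ijk, as in Eq. (6.38)."""
    return np.asarray(alpha, dtype=float) + np.asarray(counts, dtype=float)

# Toy example: X_i has r_i = 3 states, one parent configuration pa_i^j.
alpha_ij = [1.0, 1.0, 1.0]   # uniform Dirichlet prior
N_ij = [5, 2, 3]             # counts N_ijk observed in the sample D
post = dirichlet_posterior(alpha_ij, N_ij)
print(post)                  # [6. 3. 4.]
```

The update is purely count-based, which is why complete data makes this step cheap: no iterative optimization is needed.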
For example, for the $(N+1)$-th case,

$$p(x_{N+1} \mid D, S^h) = E_{p(\theta_s \mid D, S^h)}\left(\prod_{i=1}^{n} \theta_{ijk}\right).$$

According to the parameter independence given D, we can calculate the expectation:

$$p(x_{N+1} \mid D, S^h) = \int \prod_{i=1}^{n} \theta_{ijk} \cdot p(\theta_s \mid D, S^h)\, d\theta_s = \prod_{i=1}^{n} \int \theta_{ijk} \cdot p(\theta_{ij} \mid D, S^h)\, d\theta_{ij}$$
and finally get:

$$p(x_{N+1} \mid D, S^h) = \prod_{i=1}^{n} \frac{\alpha_{ijk} + N_{ijk}}{\alpha_{ij} + N_{ij}} \qquad (6.39)$$

where $\alpha_{ij} = \sum_{k=1}^{r_i} \alpha_{ijk}$ and $N_{ij} = \sum_{k=1}^{r_i} N_{ijk}$. Because the unrestricted multinomial distribution belongs to the exponential family, the above computation is straightforward.
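Each factor of Eq. (6.39) is the mean of the corresponding posterior Dirichlet, i.e. a simple ratio of updated counts. A minimal sketch for one factor, with hypothetical hyperparameters and counts (the full prediction multiplies such factors over all $i$):

```python
import numpy as np

def predictive_prob(alpha_ij, N_ij, k):
    """One factor of Eq. (6.39): p(X_i = x_i^k | D, S^h) for a fixed
    parent configuration j, i.e. (alpha_ijk + N_ijk) / (alpha_ij + N_ij),
    where alpha_ij and N_ij sum the hyperparameters and counts
    over the r_i states of X_i."""
    alpha_ij = np.asarray(alpha_ij, dtype=float)
    N_ij = np.asarray(N_ij, dtype=float)
    return (alpha_ij[k] + N_ij[k]) / (alpha_ij.sum() + N_ij.sum())

# Hypothetical numbers: uniform prior over 3 states, counts (5, 2, 3).
p = predictive_prob([1.0, 1.0, 1.0], [5, 2, 3], k=0)
print(p)  # (1 + 5) / (3 + 10) = 6/13, about 0.4615
```

This is exactly the closed-form posterior mean that the exponential-family remark refers to: no integral has to be computed numerically.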
A Bayesian network over variables X represents the joint distribution of X. So, whether a Bayesian network is constructed from prior knowledge, from data, or from a combination of the two, in principle it can be used to infer any probability of interest. However, exact, or even approximately exact, inference in a Bayesian network with discrete variables is NP-hard. Current solutions simplify the computation by exploiting conditional independence, construct a simple network topology for a specific reasoning problem, or simplify the network structure at the cost of some loss of precision. Even so, constructing a Bayesian network often requires considerable computation. For some problems, such as naïve Bayes classification, exploiting conditional independence can greatly reduce computation without losing much precision.
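The naïve Bayes case illustrates the point: assuming every attribute is conditionally independent given the class, the joint distribution factorizes into one small table per attribute, so inference is a product of lookups. A minimal sketch under that assumption (the class priors and conditional tables below are illustrative numbers only):

```python
import numpy as np

def naive_bayes_predict(prior, cond, x):
    """Posterior over classes: p(C = c | x) is proportional to
    p(C = c) * prod_i p(X_i = x_i | C = c), using the
    conditional-independence factorization of naive Bayes."""
    scores = np.asarray(prior, dtype=float).copy()
    for i, xi in enumerate(x):
        scores *= cond[i][:, xi]   # p(X_i = x_i | C) for each class
    return scores / scores.sum()   # normalize to a posterior

# Two classes, two binary attributes (hypothetical numbers).
prior = [0.6, 0.4]
cond = [np.array([[0.8, 0.2],     # p(X_1 | C = 0)
                  [0.3, 0.7]]),   # p(X_1 | C = 1)
        np.array([[0.5, 0.5],
                  [0.9, 0.1]])]
print(naive_bayes_predict(prior, cond, x=[0, 1]))
```

With n binary attributes, this needs only 2n conditional tables instead of a joint table of size $2^n$, which is the computational saving the text alludes to.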