6.3.2 Computational learning
Learning means that a system can improve its behavior as it runs. Is the posterior distribution obtained via the Bayesian formula better than the corresponding prior? What is its learning mechanism? Here we take the normal distribution as an example and study the effect of the prior information and the sample data by changing the parameters.
Let $x_1, x_2, \ldots, x_n$ be a sample from the normal distribution $N(\theta, \sigma^2)$, where $\sigma^2$ is known and $\theta$ is unknown. To seek $\tilde{\theta}$, the estimation of $\theta$, we take another normal distribution as the prior of $\theta$, that is, $\pi(\theta) = N(\mu_0, \sigma_0^2)$.
The resulting posterior distribution of $\theta$ is also a normal distribution:

$$h(\theta \mid x) = N(\mu_1, \sigma_1^2)$$

where

$$\mu_1 = \left(\frac{\mu_0}{\sigma_0^2} + \frac{n\bar{x}}{\sigma^2}\right) \Big/ \left(\frac{1}{\sigma_0^2} + \frac{n}{\sigma^2}\right), \qquad \sigma_1^2 = \left(\frac{1}{\sigma_0^2} + \frac{n}{\sigma^2}\right)^{-1}, \qquad \bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i.$$
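To make this conjugate update concrete, the following is a minimal Python sketch. The function name normal_posterior and its argument names are our own illustration, not notation from the text; the formulas inside are exactly the ones above.

from statistics import mean

def normal_posterior(mu0, var0, var, xs):
    """Posterior N(mu1, sigma_1^2) for the mean of N(theta, sigma^2).

    mu0, var0 : prior mean and variance (mu_0, sigma_0^2)
    var       : known sampling variance (sigma^2)
    xs        : observed sample x_1, ..., x_n (must be non-empty)
    """
    n = len(xs)
    xbar = mean(xs)
    # Precisions (reciprocal variances) of the prior mean and the sample mean.
    prior_prec = 1.0 / var0
    data_prec = n / var
    # Posterior precision is the sum of the two precisions; the posterior
    # mean is the precision-weighted average of mu0 and xbar.
    var1 = 1.0 / (prior_prec + data_prec)
    mu1 = (prior_prec * mu0 + data_prec * xbar) / (prior_prec + data_prec)
    return mu1, var1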
Taking $\mu_1$, the expectation of the posterior $h(\theta \mid x)$, as the estimation of $\theta$, we have:

$$\tilde{\theta} = E(\theta \mid x) = \frac{\dfrac{1}{\sigma_0^2}\,\mu_0 + \dfrac{n}{\sigma^2}\,\bar{x}}{\dfrac{1}{\sigma_0^2} + \dfrac{n}{\sigma^2}} \qquad (6.8)$$
Therefore $\tilde{\theta}$, the estimation of $\theta$, is the weighted average of $\mu_0$, the expectation of the prior, and $\bar{x}$, the sample mean. $\sigma_0^2$ is the variance of $N(\mu_0, \sigma_0^2)$, so its reciprocal, $1/\sigma_0^2$, is the precision of $\mu_0$. Similarly, $\sigma^2/n$ is the variance of the sample mean $\bar{x}$, so its reciprocal, $n/\sigma^2$, is the precision of $\bar{x}$. Hence we see that $\tilde{\theta}$ is the weighted average of $\mu_0$ and $\bar{x}$, where the weights are their respective precisions: the smaller the variance, the bigger the weight. Moreover, the bigger the sample size $n$, the smaller the variance $\sigma^2/n$ and the bigger the weight of the sample mean, so when $n$ is quite large the effect of the prior mean becomes very small. The above analysis illustrates that the posterior obtained from the Bayesian formula integrates the prior information and the sample data; the result is more reasonable than one based merely on the prior information or the sample data alone. The learning mechanism is therefore effective. Analysis based on other conjugate prior distributions leads to similar results.
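As an illustrative check of this weighting, the snippet below reuses normal_posterior from the sketch above; the prior, the true mean, and the variance are invented numbers chosen only for the example.

import random

# Hypothetical setup: prior N(0, 1); data drawn around theta = 5
# with known sampling variance sigma^2 = 4.
random.seed(0)
for n in (1, 10, 1000):
    xs = [random.gauss(5.0, 2.0) for _ in range(n)]
    mu1, var1 = normal_posterior(mu0=0.0, var0=1.0, var=4.0, xs=xs)
    print(n, round(mu1, 3), round(var1, 5))
# As n grows, mu1 moves from near the prior mean 0 toward the sample
# mean (about 5), and the posterior variance shrinks toward 0, matching
# the precision-weighting argument above.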
According to the previous discussion, with a conjugate prior we can use the posterior information as the prior of the next computation and seek the next posterior by integrating further sample information. If we repeat this process time after time, the estimation keeps absorbing new sample data and improving, which is exactly the learning behavior described at the beginning of this section.
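A short sketch of this iteration, again reusing normal_posterior and invented data values: feeding the posterior back in as the next prior gives the same answer as processing all the samples at once, which is what makes the repeated updating coherent.

batch1 = [4.2, 5.1, 4.8]
batch2 = [5.4, 4.9]

# Sequential: the posterior from batch1 becomes the prior for batch2.
mu, var0 = 0.0, 1.0
mu, var0 = normal_posterior(mu, var0, var=4.0, xs=batch1)
mu_seq, var_seq = normal_posterior(mu, var0, var=4.0, xs=batch2)

# Batch: all samples in a single update from the original prior.
mu_all, var_all = normal_posterior(0.0, 1.0, var=4.0, xs=batch1 + batch2)

# The two routes agree up to floating-point rounding.
assert abs(mu_seq - mu_all) < 1e-12 and abs(var_seq - var_all) < 1e-12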