In the case of a single variable with multiple parameters (a single variable with multiple possible states), X is commonly regarded as a continuous variable with a Gaussian distribution. Assume its probability density is p(x | θ); then we have:

$$p(x \mid \theta) = (2\pi\nu)^{-1/2}\, e^{-(x-\mu)^2 / (2\nu)}$$

where θ = {µ, ν}.
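As a quick sanity check on this density, here is a minimal sketch in plain Python (the function name and example values are illustrative, not from the text):

```python
import math

def gaussian_density(x, mu, nu):
    """Evaluate p(x | theta) for theta = {mu, nu}, where nu is the variance."""
    return math.exp(-(x - mu) ** 2 / (2 * nu)) / math.sqrt(2 * math.pi * nu)

# Density of a standard Gaussian (mu = 0, nu = 1) at x = 0: about 0.3989.
print(gaussian_density(0.0, 0.0, 1.0))
```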
Similar to the previous approach for the binomial distribution, we first assign a prior over the parameters and then solve for the posterior given the data D = {X_1 = x_1, X_2 = x_2, …, X_N = x_N} via Bayes' theorem:

$$P(\theta \mid D) = p(D \mid \theta)\, p(\theta) / p(D)$$
Next, we use the expectation over the posterior as the prediction:

$$p(x_{N+1} \mid D) = \int p(x_{N+1} \mid \theta)\, p(\theta \mid D)\, d\theta \qquad (6.17)$$

For the exponential family, this computation is efficient and has a closed form. In the multi-sample case, if the observed value of X is discrete, the Dirichlet distribution can be used as the prior, which simplifies the computation.
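To illustrate the simplification, the following sketch applies the Dirichlet–multinomial conjugate pair, under which the integral in Eq. (6.17) collapses to a closed form (the states and pseudo-count values are illustrative assumptions):

```python
from collections import Counter

def dirichlet_predictive(data, alphas):
    """Posterior predictive P(X_{N+1} = k | D) under a Dirichlet(alphas) prior.

    The integral of Eq. (6.17) reduces to (alpha_k + N_k) / (sum(alphas) + N),
    so no numerical integration is needed.
    """
    counts = Counter(data)
    n = len(data)
    total_alpha = sum(alphas.values())
    return {k: (a + counts.get(k, 0)) / (total_alpha + n) for k, a in alphas.items()}

# Three discrete states with a symmetric Dirichlet(1, 1, 1) prior, five observations.
print(dirichlet_predictive(["a", "a", "b", "a", "c"], {"a": 1, "b": 1, "c": 1}))
# {'a': 0.5, 'b': 0.25, 'c': 0.25}
```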
The computational learning mechanism of Bayes' theorem is to take the weighted average of the expectation of the prior distribution and the sample mean, where the higher the precision, the bigger the weight. Provided that the prior is a conjugate distribution, the posterior can be used as the prior in the next round of computation, so that it can be integrated with subsequently obtained sample information. If this process is repeated again and again, the effect of the sample becomes increasingly prominent. Because the Bayesian method integrates prior information and posterior information, it avoids both the subjective bias of using only prior information and the blind searching and computation that occur when sample information is limited; it also avoids the effect of noise that comes from utilizing only posterior information. It is therefore suitable for data-mining problems with statistical features and for knowledge-discovery problems, especially those where samples are hard to collect or the cost of collecting them is high. The key to effective learning with the Bayesian method is determining the prior reasonably and precisely. Currently there are only some principles for prior determination; no complete, operational theory for determining priors exists, and in many cases the reasonableness and precision of a prior distribution are hard to evaluate. Further research is required to solve these problems.
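This weighted-average behaviour, and the reuse of the posterior as the next round's prior, can be seen concretely in the conjugate Gaussian case with known variance; the following is a minimal sketch under those assumptions (the batch values and precisions are illustrative, not from the text):

```python
def update_normal(prior_mean, prior_prec, sample_mean, n, obs_prec):
    """One conjugate update for a Gaussian mean with known observation precision.

    The posterior mean is the precision-weighted average of the prior mean and
    the sample mean: the higher the precision, the bigger the weight.
    """
    post_prec = prior_prec + n * obs_prec
    post_mean = (prior_prec * prior_mean + n * obs_prec * sample_mean) / post_prec
    return post_mean, post_prec

# Feed each batch's posterior back in as the next round's prior.
mean, prec = 0.0, 1.0                      # initial prior
for batch in ([1.2, 0.8], [1.1, 0.9, 1.0]):
    sample_mean = sum(batch) / len(batch)
    mean, prec = update_normal(mean, prec, sample_mean, len(batch), obs_prec=4.0)
print(mean, prec)  # the sample's influence grows with each round
```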
6.4 Naïve Bayesian Learning Model
In naïve Bayesian learning models, a training sample I is decomposed into a feature vector X and a decision class variable C. Here, it is assumed that all the components of the feature vector are independent given the decision variable. In other words, the class-conditional distribution factorizes as $P(X \mid C) = \prod_i P(X_i \mid C)$.
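As a rough illustration of this factorization, here is a sketch with hypothetical hand-set conditional tables (the class names, feature values, and probabilities are illustrative, not a model from the text):

```python
def naive_bayes_score(x, c, cond_probs, class_prior):
    """P(C = c) * prod_i P(X_i = x_i | C = c), the naive Bayes joint score."""
    score = class_prior[c]
    for i, xi in enumerate(x):
        score *= cond_probs[c][i].get(xi, 1e-9)  # tiny floor for unseen values
    return score

# Two classes, two binary features, with hand-set conditional tables.
class_prior = {"pos": 0.5, "neg": 0.5}
cond_probs = {
    "pos": [{0: 0.2, 1: 0.8}, {0: 0.3, 1: 0.7}],
    "neg": [{0: 0.7, 1: 0.3}, {0: 0.6, 1: 0.4}],
}
scores = {c: naive_bayes_score((1, 1), c, cond_probs, class_prior)
          for c in class_prior}
print(max(scores, key=scores.get))  # "pos"
```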