Based on the above independence assumption and the Bayes formula, equation (6.54) can be rewritten as:
$$
p(c_j \mid d, \theta)
= \frac{p(c_j \mid \theta)\, p(d \mid c_j, \theta)}{p(d \mid \theta)}
= \frac{p(c_j \mid \theta) \prod_{r=1}^{m} p(w_r \mid c_j, \theta)}
       {\sum_{i=1}^{k} p(c_i \mid \theta) \prod_{r=1}^{m} p(w_r \mid c_i, \theta)}
\qquad (6.55)
$$
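To make (6.55) concrete, the following minimal Python sketch evaluates the posterior over classes for a single document. The names (`posterior_over_classes`, `doc_counts`, `priors`, `word_probs`) are illustrative choices of ours, not from the text; the computation runs in log space so that the long products over words do not underflow, with repeated occurrences of a word entering as a count-weighted exponent, as the multinomial model adopted below implies.

```python
import numpy as np

def posterior_over_classes(doc_counts, priors, word_probs):
    """Evaluate p(c_j | d, theta) for every class j, as in equation (6.55).

    doc_counts: length-m vector, doc_counts[t] = n(d, w_t) for this document.
    priors:     length-k vector of class priors p(c_j | theta).
    word_probs: k x m matrix of word probabilities p(w_t | c_j, theta).
    """
    # Numerator of (6.55) in log space:
    #   log p(c_j | theta) + sum_t n(d, w_t) * log p(w_t | c_j, theta)
    log_joint = np.log(priors) + doc_counts @ np.log(word_probs).T
    # The denominator of (6.55) is the sum of the numerators over all
    # classes, so normalizing the exponentiated values gives the posterior.
    log_joint -= log_joint.max()          # guard against underflow
    joint = np.exp(log_joint)
    return joint / joint.sum()
```

Classification then assigns a document to $\arg\max_j p(c_j \mid d, \theta)$ using the parameters estimated in (6.56a) and (6.56b) below.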
The learning task thus becomes learning the parameters of the model from the prior information in the training data. Here we adopt the multinomial distribution together with its conjugate Dirichlet distribution.
à =
|
D
i
|
I
(
c
(
d
)
=
c
)
i
j
θ
=
p
(
c
|
θ
)
=
1
(6.56a)
c j
j
|
D
|
Ã
|
D
|
$$
\theta_{w_t \mid c_j} = p(w_t \mid c_j, \theta)
= \frac{\alpha_{jt} + \sum_{i=1}^{|D|} n(d_i, w_t)\, I\bigl(c(d_i) = c_j\bigr)}
       {\alpha_{j0} + \sum_{k=1}^{m} \sum_{i=1}^{|D|} n(d_i, w_k)\, I\bigl(c(d_i) = c_j\bigr)}
\qquad (6.56b)
$$
where $n(d_i, w_t)$ is the number of occurrences of word $w_t$ in document $d_i$; $\alpha_{jt}$ is the hyperparameter (super-parameter) of the model, with $\alpha_{j0} = \sum_{k=1}^{m} \alpha_{jk}$; $c(\cdot)$ is the class labeling function; and $I(a = b)$ is the characteristic function (if $a = b$, then $I(a = b) = 1$; otherwise $I(a = b) = 0$).
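The two estimators translate directly into code. Below is a sketch under the same illustrative naming, assuming a uniform hyperparameter `alpha` in place of every $\alpha_{jt}$ (so that $\alpha_{j0} = m\alpha$):

```python
import numpy as np

def estimate_parameters(doc_counts, labels, num_classes, alpha=1.0):
    """Estimate theta_{c_j} via (6.56a) and theta_{w_t|c_j} via (6.56b).

    doc_counts: |D| x m matrix with doc_counts[i, t] = n(d_i, w_t).
    labels:     length-|D| integer array, labels[i] = j when c(d_i) = c_j.
    alpha:      uniform Dirichlet hyperparameter standing in for alpha_{jt}.
    """
    num_docs, vocab_size = doc_counts.shape
    # (6.56a): the prior of c_j is the fraction of documents labeled c_j.
    priors = np.bincount(labels, minlength=num_classes) / num_docs
    # (6.56b): per-class word counts, smoothed by the Dirichlet prior.
    class_word_counts = np.zeros((num_classes, vocab_size))
    for j in range(num_classes):
        class_word_counts[j] = doc_counts[labels == j].sum(axis=0)
    numer = alpha + class_word_counts                # alpha_{jt} + counts
    denom = alpha * vocab_size + class_word_counts.sum(axis=1, keepdims=True)
    return priors, numer / denom                     # p(c_j), p(w_t | c_j)
```

With `alpha = 1.0`, the estimator (6.56b) reduces to the familiar Laplace smoothing.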
Although the conditions under which the naïve Bayesian model strictly applies are rather restrictive, numerous experiments demonstrate that even when the independence assumption is not satisfied, the naïve Bayesian model still works robustly. It has become one of the most popular methods for text classification.
Below we will classify the unlabeled documents according to the MAP criterion, based on the knowledge contained in these unlabeled documents.
Consider the entire sample set $D = D^L \cup D^U$, where $D^L$ is the set of documents that were labeled in the first stage and $D^U$ is the set of unlabeled documents. Assume that the generation of all samples in $D$ is mutually independent; then the following equation holds:
$$
p(D \mid \theta)
= \prod_{d_i \in D^U} \sum_{j=1}^{|C|} p(c_j \mid \theta)\, p(d_i \mid c_j, \theta)
\;\cdot \prod_{d_i \in D^L} p\bigl(c(d_i) \mid \theta\bigr)\, p\bigl(d_i \mid c(d_i), \theta\bigr)
\qquad (6.57)
$$
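In log form, (6.57) splits into a labeled part, where each document contributes only its labeled class, and an unlabeled part, where each document contributes a log-sum over all $|C|$ classes. A sketch with the same hypothetical names as before:

```python
import numpy as np
from scipy.special import logsumexp

def log_likelihood(labeled_counts, labels, unlabeled_counts,
                   priors, word_probs):
    """Compute log p(D | theta) for D = D^L ∪ D^U, following (6.57)."""
    log_priors = np.log(priors)      # log p(c_j | theta)
    log_wp = np.log(word_probs)      # log p(w_t | c_j, theta)
    # Labeled part: d_i in D^L contributes p(c(d_i)|theta) p(d_i|c(d_i),theta).
    labeled_ll = (log_priors[labels]
                  + np.einsum('it,it->i', labeled_counts, log_wp[labels])).sum()
    # Unlabeled part: d_i in D^U is a mixture, summed over the |C| classes.
    log_joint = log_priors + unlabeled_counts @ log_wp.T  # shape (|D^U|, |C|)
    unlabeled_ll = logsumexp(log_joint, axis=1).sum()
    return labeled_ll + unlabeled_ll
```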
In the above equation, the unlabeled documents are regarded as being generated by a mixture model. Our learning task is to obtain the maximum a posteriori estimate of the model parameter $\theta$ from the sample set $D$. According to the Bayes theorem, we have:
$$
p(\theta \mid D) = \frac{p(\theta)\, p(D \mid \theta)}{p(D)}
\qquad (6.58)
$$
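Since $p(D)$ does not depend on $\theta$, maximizing the posterior in (6.58) is equivalent to maximizing the sum of the log prior and the log likelihood; restating this standard step explicitly:

$$
\hat{\theta}_{\mathrm{MAP}}
= \arg\max_{\theta}\, p(\theta \mid D)
= \arg\max_{\theta}\, \bigl[\log p(\theta) + \log p(D \mid \theta)\bigr]
$$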