In the first method, the class with maximum posterior probability will be selected, viz. the class $c_j$ such that $P(c_j \mid x) = \max_{1 \le i \le l} P(c_i \mid x)$. In this case the decision function is $r_i(x) = P(c_i \mid x)$. It has been proved that in this method the minimum classification error can be guaranteed.
The second method is often used in decision theory. It utilizes average benefit
to evaluate decision risk, which has a close relationship with degrees of uncertainty.
Let $L_{ij}(X)$ be the loss of misclassifying a feature vector $X$ of class $c_i$ to class $c_j$.
The class with minimum loss for $X$ is $\tilde{c} = \arg\min_{j} \left\{ \sum_{i=1}^{l} L_{ij}(x)\, P(c_i \mid x) \right\}$. In this case, the decision function is $r_j(x) = \sum_{i=1}^{l} L_{ij}(x)\, P(c_i \mid x)$. If the diagonal elements of $L_{ij}(X)$ are all 0 and the non-diagonal elements of $L_{ij}(X)$ are all 1, viz. correct classification incurs no loss and every misclassification incurs the same loss, the first method and the second method are equal.
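To make the two decision rules concrete, the sketch below applies both to the same assumed posteriors: the first rule picks the class with the largest $P(c_i \mid x)$, while the second picks the class with the smallest expected loss $r_j(x)$. The posterior values and loss matrices are illustrative assumptions, not values from the text.

def map_decision(posteriors):
    """First method: choose the class j that maximizes P(c_j | x)."""
    return max(range(len(posteriors)), key=lambda j: posteriors[j])

def min_risk_decision(posteriors, loss):
    """Second method: choose the class j that minimizes
    r_j(x) = sum_i L_ij(x) * P(c_i | x)."""
    l = len(posteriors)
    risks = [sum(loss[i][j] * posteriors[i] for i in range(l)) for j in range(l)]
    return min(range(l), key=lambda j: risks[j])

posteriors = [0.2, 0.5, 0.3]  # assumed P(c_i | x) for three classes

# 0-1 loss: diagonal elements 0, non-diagonal elements 1.
zero_one_loss = [[0 if i == j else 1 for j in range(3)] for i in range(3)]

# Under 0-1 loss the two rules agree, as stated above.
assert map_decision(posteriors) == min_risk_decision(posteriors, zero_one_loss)

# With an asymmetric loss matrix (here, misclassifying class 0 as class 1 is
# very costly), the minimum-risk decision can differ from the MAP decision.
asymmetric_loss = [[0, 10, 1],
                   [0.1, 0, 1],
                   [1, 1, 0]]
print(map_decision(posteriors), min_risk_decision(posteriors, asymmetric_loss))  # 1 0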
In data mining, research on Bayesian classification mainly focuses on how to learn the distribution of feature vectors and the correlation among feature vectors from data, so as to find the best estimate of $P(c_i \mid x)$. By now successful models have been proposed, including Naïve Bayes, Bayesian networks and Bayesian neural networks. The Bayesian classification method has been successfully applied to many fields, such as text classification, character recognition, and economic prediction.
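As one concrete instance of such a model, the sketch below implements a minimal Naïve Bayes classifier for categorical features: it estimates the prior $P(c)$ and the per-feature likelihoods $P(x_k \mid c)$ by counting, and combines them through Bayes' rule to score $P(c \mid x)$. The toy training data, feature names and add-one smoothing are illustrative assumptions.

from collections import Counter, defaultdict

def train(samples):
    """samples: list of (feature_tuple, class_label) pairs."""
    class_counts = Counter(label for _, label in samples)
    feature_counts = defaultdict(Counter)  # (class, feature index) -> value counts
    for features, label in samples:
        for k, value in enumerate(features):
            feature_counts[(label, k)][value] += 1
    return class_counts, feature_counts

def posterior_scores(class_counts, feature_counts, x):
    """Unnormalized P(c | x), proportional to P(c) * prod_k P(x_k | c)."""
    total = sum(class_counts.values())
    scores = {}
    for c, n_c in class_counts.items():
        p = n_c / total  # prior P(c)
        for k, value in enumerate(x):
            counts = feature_counts[(c, k)]
            # add-one smoothing so unseen feature values do not zero the score
            p *= (counts[value] + 1) / (n_c + len(counts) + 1)
        scores[c] = p
    return scores

# Hypothetical data: (outlook, windy) -> play
data = [(("sunny", "no"), "yes"), (("rain", "yes"), "no"),
        (("sunny", "yes"), "yes"), (("rain", "no"), "no")]
class_counts, feature_counts = train(data)
print(posterior_scores(class_counts, feature_counts, ("sunny", "yes")))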
2. Bayesian method in causal reasoning and uncertain knowledge representation
A Bayesian network is a graph that describes the probabilistic relations among random variables. In recent years, Bayesian networks have become the primary method for representing uncertain knowledge in expert systems. Many algorithms have been proposed to learn Bayesian networks from data, and these techniques have achieved reasonable success in data modeling, uncertainty reasoning and so on; a small sketch of such a network is given below.
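As an example of this representation, the sketch below encodes a Bayesian network as a directed acyclic graph whose nodes carry conditional probability tables, and evaluates a joint probability through the chain-rule factorization $P(x_1, \dots, x_n) = \prod_i P(x_i \mid \mathrm{parents}(x_i))$. The three-node rain/sprinkler/wet-grass structure and all probability values are illustrative assumptions.

# Each node stores its parents and a CPT mapping a tuple of parent values
# to P(node = True | parents).
network = {
    "rain":      {"parents": [], "cpt": {(): 0.2}},
    "sprinkler": {"parents": ["rain"], "cpt": {(True,): 0.01, (False,): 0.4}},
    "wet_grass": {"parents": ["rain", "sprinkler"],
                  "cpt": {(True, True): 0.99, (True, False): 0.8,
                          (False, True): 0.9, (False, False): 0.0}},
}

def joint_probability(network, assignment):
    """Chain rule: P(x_1, ..., x_n) = prod_i P(x_i | parents(x_i))."""
    p = 1.0
    for node, spec in network.items():
        parent_values = tuple(assignment[parent] for parent in spec["parents"])
        p_true = spec["cpt"][parent_values]
        p *= p_true if assignment[node] else 1.0 - p_true
    return p

# P(rain=True, sprinkler=False, wet_grass=True) = 0.2 * 0.99 * 0.8
print(joint_probability(network, {"rain": True, "sprinkler": False, "wet_grass": True}))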
Compared to other knowledge representation methods in data mining, such as rule representation, decision trees, and artificial neural networks, Bayesian networks possess the following merits for knowledge representation (Cooper, 1992):
(1) Bayesian networks can conveniently handle incomplete data. For example, when we face a classification or regression problem with multiple correlated variables, the correlation among the variables is not a key consideration for standard supervised learning algorithms. As a result, missing values will cause a large predictive bias. Yet a Bayesian network can handle incomplete data with the