Note that in the case of a probabilistic classifier, the crisp classification $y_k(x)$ is usually obtained as follows:

$$y_k(x) = \operatorname*{argmax}_{c_i \in \mathrm{dom}(y)} P_{M_k}(y = c_i \mid x), \qquad (9.3)$$
where $M_k$ denotes classifier $k$ and $P_{M_k}(y = c \mid x)$ denotes the probability of $y$ obtaining the value $c$ given an instance $x$.
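The selection rule in Eq. (9.3) can be sketched in a few lines of Python; the probability vector and the class indexing below are illustrative, not taken from the text:

```python
# Hypothetical probability vector produced by classifier M_k for one
# instance x, indexed by dom(y) = [class 0, class 1, class 2].
proba = [0.2, 0.5, 0.3]

# Crisp classification per Eq. (9.3): pick the class index whose
# estimated probability is highest.
y_k = max(range(len(proba)), key=lambda i: proba[i])
print(y_k)  # prints 1
```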
9.3.1.2 Performance Weighting
The weight of each classifier can be set proportional to its accuracy on a validation set [Opitz and Shavlik (1996)]:

$$\alpha_i = \frac{1 - E_i}{\sum_{j=1}^{T} (1 - E_j)}, \qquad (9.4)$$

where $E_i$ denotes the error of classifier $i$ evaluated on the validation set, and the denominator normalizes the weights over all $T$ classifiers.
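A minimal sketch of Eq. (9.4), assuming the validation-set error rates are already available (the error values here are illustrative):

```python
# Hypothetical validation-set error rates E_i for an ensemble of T = 3
# classifiers.
errors = [0.10, 0.25, 0.15]

# Eq. (9.4): each weight is proportional to the classifier's validation
# accuracy (1 - E_i); dividing by the sum normalizes the weights to 1.
denom = sum(1.0 - e for e in errors)
weights = [(1.0 - e) / denom for e in errors]
print([round(w, 3) for w in weights])  # prints [0.36, 0.3, 0.34]
```

The most accurate classifier (lowest $E_i$) receives the largest weight, and the weights sum to one by construction.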
9.3.1.3 Distribution Summation
The idea of the distribution summation combining method is to sum up the conditional probability vectors obtained from each classifier [Clark and Boswell (1991)]. The selected class is chosen according to the highest value in the total vector. Mathematically, it can be written as:

$$Class(x) = \operatorname*{argmax}_{c_i \in \mathrm{dom}(y)} \sum_k P_{M_k}(y = c_i \mid x). \qquad (9.5)$$
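Eq. (9.5) can be sketched as follows; the three probability vectors are illustrative stand-ins for the outputs of three classifiers:

```python
# Hypothetical conditional probability vectors P_{M_k}(y = c_i | x) from
# T = 3 classifiers for one instance x; columns are the classes in dom(y).
probas = [
    [0.6, 0.3, 0.1],
    [0.2, 0.6, 0.2],
    [0.4, 0.4, 0.2],
]

# Eq. (9.5): sum the vectors component-wise over the classifiers, then
# choose the class whose total is highest.
totals = [sum(col) for col in zip(*probas)]
predicted = max(range(len(totals)), key=lambda i: totals[i])
print(predicted)  # prints 1
```

Note that, unlike majority voting on crisp labels, this method lets a classifier that is only mildly confident still contribute to the final decision.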
9.3.1.4 Bayesian Combination
In the Bayesian combination method the weight associated with each classifier is the posterior probability of the classifier given the training set [Buntine (1990)]:

$$Class(x) = \operatorname*{argmax}_{c_i \in \mathrm{dom}(y)} \sum_k P(M_k \mid S) \cdot P_{M_k}(y = c_i \mid x), \qquad (9.6)$$

where $P(M_k \mid S)$ denotes the probability that the classifier $M_k$ is correct given the training set $S$. The estimation of $P(M_k \mid S)$ depends on the classifier's representation; to estimate this value for decision trees, the reader is referred to [Buntine (1990)].
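Assuming the posteriors $P(M_k \mid S)$ have already been estimated somehow (the values below are illustrative placeholders, not from any particular estimation scheme), Eq. (9.6) can be sketched as:

```python
# Hypothetical posteriors P(M_k | S): the probability that each classifier
# is correct given the training set S. How these are estimated depends on
# the classifier's representation.
posterior = [0.5, 0.3, 0.2]

# Per-classifier conditional probability vectors P_{M_k}(y = c_i | x).
probas = [
    [0.7, 0.2, 0.1],
    [0.1, 0.8, 0.1],
    [0.3, 0.3, 0.4],
]

# Eq. (9.6): weight each classifier's vector by its posterior, sum the
# weighted vectors over the classifiers, and pick the largest score.
scores = [sum(p * row[i] for p, row in zip(posterior, probas))
          for i in range(len(probas[0]))]
predicted = max(range(len(scores)), key=lambda i: scores[i])
print(predicted)  # prints 0
```

Compared with distribution summation, the only change is that each classifier's probability vector is scaled by its posterior before the sum, so classifiers judged more trustworthy on $S$ dominate the combination.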