Let us now consider non-singleton sets, $B_i$. We first notice that:

$$P(\{B_1, B_2\} \mid \omega) = P(B_1 \mid \omega) + P(B_2 \mid \omega) \quad \text{whenever } B_1 \cap B_2 = \emptyset. \qquad (6.58)$$
Therefore, for any non-singleton $B_i$, we may write:

$$P(B_i \mid 1) = \sum_{b \in B_i} P(b \mid 1); \qquad P(B_i \mid 0) = \sum_{b \in B_i} P(b \mid 0). \qquad (6.59)$$
Thus, once $P_{10}$ and $P_{01}$ have been computed for all $m$ singleton sets, it is an easy task to compute them for any non-singleton category set $B_i$, using formulas (6.57) and (6.59), since for both theoretical and empirical probability mass functions pertaining to $B_i$:

$$P_{10} = p\,(1 - P(B_i \mid 1)); \qquad P_{01} = q\,P(B_i \mid 0). \qquad (6.60)$$
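Formulas (6.59) and (6.60) translate directly into code. The sketch below is illustrative only: the function and variable names (`union_error_probs`, `p_b1`, `p_b0`) are hypothetical, and the singleton conditional probabilities are assumed to be available as dictionaries.

```python
# Sketch of formulas (6.59) and (6.60): error probabilities for a
# non-singleton category set B_i, computed from singleton quantities.
# All names here are illustrative, not taken from the text.

def union_error_probs(B_i, p_b1, p_b0, p, q):
    """B_i: iterable of singleton categories b.
    p_b1[b] = P(b|1), p_b0[b] = P(b|0) for each singleton b.
    p, q: the class priors appearing in (6.60).
    Returns (P_10, P_01) for the set B_i."""
    P_Bi_1 = sum(p_b1[b] for b in B_i)   # (6.59): P(B_i|1)
    P_Bi_0 = sum(p_b0[b] for b in B_i)   # (6.59): P(B_i|0)
    P_10 = p * (1.0 - P_Bi_1)            # (6.60)
    P_01 = q * P_Bi_0                    # (6.60)
    return P_10, P_01
```

For instance, with $p = 0.6$ and $P(B_i \mid 1) = 0.8$, the sketch gives $P_{10} = 0.6 \times 0.2 = 0.12$.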
6.6.1.3 Error Entropy of Class Unions
Since the MEE approach is a two-class discrimination approach, one expects to obtain performance improvements for datasets with $c > 3$ classes by considering class unions, i.e., by including merged classes, say of $k$ classes with $k$ up to $c/2$, in the set of candidate classes. The difficulty thereof is that the number of candidate classes may become quite high. There is, however, a fast way of computing the dichotomous decision errors for unions of classes, as we shall now show.

Consider the class union $\omega = \omega_1 \cup \omega_2$, $\omega_1 \cap \omega_2 = \emptyset$, and suppose that we have computed the following three quantities for $\omega_i$, $(i = 1, 2)$, and for a decision rule $r$:
1. $n_{10}(\omega_i, r)$ — number of instances of class $i$ that do not satisfy the rule;
2. $n_{11}(\omega_i, r)$ — number of instances of class $i$ that satisfy the rule;
3. $n_r$ — number of instances (from whatever class) that satisfy the rule.
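Because $\omega_1$ and $\omega_2$ are disjoint, the first two counts for the union are simply the sums of the per-class counts, while $n_r$ does not depend on the class under consideration. A minimal sketch of this bookkeeping, assuming each class's counts are held as a `(n_10, n_11)` pair (the names are illustrative, not from the text):

```python
# Sketch: counts for a disjoint class union w = w1 U w2 follow by
# adding the per-class counts under the same decision rule r; n_r
# (instances from any class satisfying r) is unchanged by the union.
# All names here are illustrative, not taken from the text.

def union_counts(counts1, counts2):
    """counts_k = (n_10, n_11) for class w_k under rule r.
    Returns (n_10, n_11) for the union w1 U w2, with w1 and w2 disjoint."""
    n10_1, n11_1 = counts1
    n10_2, n11_2 = counts2
    return (n10_1 + n10_2, n11_1 + n11_2)
```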
$P_{10}(\omega_i)$ (i.e., $P_{10}$ for the $\omega_i$ vs $\bar{\omega}_i$ decision) is, as before, simply $n_{10}(\omega_i, r)/n$. Let us now see how one can compute $P_{01}(\omega_i)$ using $n_{11}(\omega_i, r)$ instead of $n_{01}(\omega_i, r)$. First, notice that:
$$P(r) = P(\omega_i, r) + P(\bar{\omega}_i, r); \qquad (6.61)$$

$$P(\omega_i, r) = P(\omega_i)\,P(r \mid \omega_i) = \frac{n_{11}(\omega_i, r)}{n}, \qquad (6.62)$$
where $n$ is the total number of instances at the node. From formulas (6.61) and (6.62) we derive: