A leaf is reached whenever a lower bound on the number of instances is reached or an interval-end hit occurs, signaling a large distribution overlap (Sect. 4.1.3.2). Instead of interval-end hits, other distribution overlap criteria could be envisaged; for instance, one could attempt to detect a concave shape of EE (exemplified by Fig. 4.8b), as done in [151]. Final class label assignment of the leaves is made by the usual process of majority voting (sketched below). The algorithm also has to take care of the following three issues specific to these trees.
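As a concrete illustration of the leaf step, here is a minimal Python sketch of majority-vote label assignment combined with a simple instance-count stopping rule; the names `min_instances` and `interval_end_hit` are illustrative assumptions, not identifiers from the original algorithm.

```python
from collections import Counter

def should_stop(labels, interval_end_hit, min_instances=10):
    """Create a leaf when the nodal set is small or an interval-end
    hit signals a large distribution overlap (Sect. 4.1.3.2)."""
    return len(labels) <= min_instances or interval_end_hit

def leaf_label(labels):
    """Assign the leaf's class label by majority voting."""
    return Counter(labels).most_common(1)[0][0]

labels = [1, 1, 0, 1, 0, 1, 1]               # toy nodal set
if should_stop(labels, interval_end_hit=True):
    print("leaf label:", leaf_label(labels))  # -> 1
```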
6.6.1.1 Position of the Candidate Class
For numerical features we have seen in Sect. 4.1.4.2 that EE depends on the position of the class (i.e., on whether we decide $\omega_k$ for $u_{jl}$ or $u_{jr}$). Thus, error rates must be computed for two class configurations: the candidate class ($t = 1$) corresponds to the $x_{ij} > \Delta_j$ rule, or the candidate class ($t = 1$) corresponds to the $x_{ij} \leq \Delta_j$ rule. Note that the candidate class position achieving MEE is not known a priori.
It is, however, a trivial task to compute EE for one of the configurations when the error rates have been computed for the other. Let us assume that the computation for the $x > \Delta$ configuration has been carried out for a nodal set with $n$ instances. In pseudo-code: rule ← ($x > \Delta$); $n_{10}$ ← sum(not rule and $t$); $n_{01}$ ← sum(rule and not $t$). Using these $n$'s one then computes their values for the other configuration simply as: $n_1$ ← sum($t$); $n_{10}$ ← $n_1 - n_{10}$; $n_{01}$ ← $n - n_1 - n_{01}$.
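A minimal Python rendering of this shortcut is given below, assuming a numeric feature array `x` and a 0/1 target array `t`; the function names are chosen here for illustration only.

```python
import numpy as np

def error_counts(x, t, delta):
    """Error counts for the 'decide class 1 when x > delta' configuration.

    n10: class-1 instances (t = 1) that the rule assigns to class 0.
    n01: class-0 instances (t = 0) that the rule assigns to class 1.
    """
    rule = x > delta
    n10 = np.sum(~rule & (t == 1))
    n01 = np.sum(rule & (t == 0))
    return n10, n01

def flipped_counts(n10, n01, t):
    """Error counts for the opposite configuration (decide class 1
    when x <= delta), derived without re-scanning the data."""
    n = len(t)
    n1 = np.sum(t == 1)
    return n1 - n10, n - n1 - n01

# Toy nodal set with n = 5 instances
x = np.array([0.2, 0.8, 0.5, 0.9, 0.1])
t = np.array([0, 1, 0, 1, 1])
n10, n01 = error_counts(x, t, delta=0.4)
print(flipped_counts(n10, n01, t))  # -> (2, 1)
```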
The class position issue does not apply to categorical variables.
6.6.1.2 Handling Categorical Variables
MEE trees handle subsets of categorical features of a set $B = \{b_1, \ldots, b_m\}$ in a simple way: the full computation of the MEE split point only has to be carried out for the $m$ singleton categories (instead of having to perform $2^m$ computations). This represents a considerable time saving. We now prove this. Denoting a singleton category set $\{b\}$ by $b$ and its complement, $B - \{b\}$, by $\bar{b}$, let us assume that we have computed

$$P_{10} = P(1, \bar{b}), \qquad P_{01} = P(0, b), \tag{6.56}$$

where $P(1, \bar{b})$ and $P(0, b)$ are short notations for $P(\omega_1, x \notin \{b\})$ and $P(\omega_0, x \in \{b\})$, respectively (with $\omega_0 \doteq \bar{\omega}_1$). The decision rule here is $r \doteq x \in \{b\}$.
We have already seen in formula (4.46) that in order to compute EE we only need the $P_{10}$ and $P_{01}$ values. These can also be used with the class priors $p = P(\omega_1)$ and $q = 1 - p$ to compute:

$$P(\bar{b} \mid 1) = P_{10}/p; \quad P(b \mid 1) = 1 - P_{10}/p; \quad P(b \mid 0) = P_{01}/q. \tag{6.57}$$
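The quantities in (6.56) and (6.57) can be estimated from a nodal set by simple counting. The Python sketch below (function and variable names are illustrative assumptions, not the book's code) evaluates $P_{10}$, $P_{01}$ and the conditionals of (6.57) for each of the $m$ singleton categories, which is exactly the linear-time loop the proof justifies.

```python
import numpy as np

def singleton_stats(x_cat, t):
    """For each singleton category b of a categorical feature, estimate
    P10 = P(omega_1, x not in {b}) and P01 = P(omega_0, x in {b}) under
    the rule 'decide omega_1 iff x in {b}', plus the conditionals (6.57)."""
    n = len(t)
    p = np.mean(t == 1)          # prior P(omega_1)
    q = 1.0 - p                  # prior P(omega_0)
    stats = {}
    for b in np.unique(x_cat):   # m singleton candidates, not 2^m subsets
        in_b = (x_cat == b)
        P10 = np.sum(~in_b & (t == 1)) / n   # class 1 decided as 0
        P01 = np.sum(in_b & (t == 0)) / n    # class 0 decided as 1
        stats[b] = {
            "P10": P10,
            "P01": P01,
            "P(b_bar|1)": P10 / p,
            "P(b|1)": 1.0 - P10 / p,
            "P(b|0)": P01 / q,
        }
    return stats

# Toy example with three categories
x_cat = np.array(["a", "b", "a", "c", "b", "a"])
t     = np.array([ 1,   0,   1,   0,   1,   0 ])
for b, s in singleton_stats(x_cat, t).items():
    print(b, s)
```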