For example, considering a feature $A$, data $D$ is split by $A$ into $p$ partitions $D_1, D_2, \ldots, D_p$, and $C$ is the number of classes. The information for $D$ at the root amounts to
$$I(D) = -\sum_{i=1}^{C} P_D(c_i) \log_2 P_D(c_i),$$
the information for $D_j$ due to partitioning $D$ at $A$ is
$$I(D_j) = -\sum_{i=1}^{C} P_{D_j}(c_i) \log_2 P_{D_j}(c_i),$$
and the information gain due to the feature A is defined as
$$IG(A) = I(D) - \sum_{j=1}^{p} \frac{|D_j|}{|D|} I(D_j),$$
where $|D|$ is the number of instances in $D$, and $P_D(c_i)$ are the prior probabilities for data $D$.
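The definitions above can be illustrated with a minimal Python sketch, assuming a discrete feature and estimating each probability $P_D(c_i)$ as a relative class frequency. The function names `entropy` and `information_gain` and the toy data are hypothetical, chosen only for illustration.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """I(D) = -sum_i P_D(c_i) * log2 P_D(c_i), with P_D estimated from class counts."""
    n = len(labels)
    return -sum((k / n) * log2(k / n) for k in Counter(labels).values())


def information_gain(feature_values, labels):
    """IG(A) = I(D) - sum_j |D_j|/|D| * I(D_j), one partition D_j per distinct value of A."""
    n = len(labels)
    partitions = {}
    for value, label in zip(feature_values, labels):
        partitions.setdefault(value, []).append(label)
    # Weighted information remaining after partitioning D at A
    remainder = sum(len(part) / n * entropy(part) for part in partitions.values())
    return entropy(labels) - remainder


# Toy data, made up for illustration: the feature separates the classes perfectly
outlook = ["sunny", "sunny", "rain", "rain"]
play = ["no", "no", "yes", "yes"]
print(information_gain(outlook, play))  # 1.0
```

In this toy case every partition is pure, so the gain equals the full information $I(D)$ of one bit; a feature whose partitions all keep the original class mix would score zero.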
Information gain tends to favor features with many distinct values. The information gain ratio was suggested in [43] to counteract this bias. It is worth mentioning that it applies only to discrete features. For continuous ones, we have to find the split point with the highest gain or gain ratio among the sorted values, so that the values are split into two segments. Then, information gain can be computed as usual.
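As a rough sketch of both ideas, the Python code below computes a gain ratio and searches for a split point on a continuous feature. The text does not give the gain ratio formula, so the code assumes the common definition in which the gain is divided by the entropy of the partition sizes (the split information). Taking candidate thresholds midway between consecutive distinct sorted values is also an assumption, and all function names and data are hypothetical.

```python
from collections import Counter
from math import log2


def entropy(items):
    """Entropy in bits of the empirical distribution of `items`."""
    n = len(items)
    return -sum((k / n) * log2(k / n) for k in Counter(items).values())


def split_gain(parts, labels):
    """Information gain of splitting `labels` into the label lists in `parts`."""
    n = len(labels)
    return entropy(labels) - sum(len(p) / n * entropy(p) for p in parts if p)


def gain_ratio(feature_values, labels):
    """Gain divided by split information (assumed definition, see lead-in)."""
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    split_info = entropy(feature_values)  # entropy of the partition sizes
    if split_info == 0:  # a single-valued feature cannot split the data
        return 0.0
    return split_gain(list(groups.values()), labels) / split_info


def best_split_point(values, labels):
    """Return (threshold, gain) for the best two-segment split of a continuous feature."""
    pairs = sorted(zip(values, labels), key=lambda pair: pair[0])
    best_gain, best_threshold = 0.0, None
    for k in range(1, len(pairs)):
        low, high = pairs[k - 1][0], pairs[k][0]
        if low == high:  # no threshold fits between equal values
            continue
        left = [label for _, label in pairs[:k]]
        right = [label for _, label in pairs[k:]]
        gain = split_gain([left, right], labels)
        if gain > best_gain:
            best_gain, best_threshold = gain, (low + high) / 2
    return best_threshold, best_gain


# Toy data, made up for illustration
temperature = [64, 65, 70, 80, 85]
play = ["yes", "yes", "yes", "no", "no"]
threshold, gain = best_split_point(temperature, play)
print(threshold)  # 75.0
```

Once the threshold is fixed, the continuous feature behaves like a two-valued discrete one, which is why the gain can then be computed as usual.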
7.2.2.2 Distance Measures
These measures are also known as measures of separability, discrimination or divergence. The most typical is derived from the distance between the class conditional density functions. For example, in a two-class problem, if $D(A)$ is the distance between $P(A|c_1)$ and $P(A|c_2)$, a feature evaluation rule based on the distance $D(A)$ states that $A_i$ is chosen instead of $A_j$ if $D(A_i) > D(A_j)$. The rationale behind this is that we try to find the feature that separates the two classes as far as possible.
Distance functions between the prior and posterior class probabilities are similar to the information gain approach, except that the functions are based on distances instead of uncertainty. Both have been proposed for feature evaluation.
Two popular distance measures are used in FS: directed divergence $DD$ and variance $V$. We show their computation expressions as
$$DD(A_j) = \int \left[ \sum_{i} P(c_i | A_j = a) \log \frac{P(c_i | A_j = a)}{P(c_i)} \right] P(A_j = a) \, dx.$$
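To make the directed divergence concrete, here is a minimal Python sketch for a discrete feature. It assumes that the integral reduces to a sum over the observed values $a$, that all probabilities are estimated as relative frequencies, and that the logarithm is natural (the expression above does not fix the base). The name `directed_divergence` and the toy data are hypothetical.

```python
from collections import Counter, defaultdict
from math import log


def directed_divergence(feature_values, labels):
    """Discrete estimate of DD(A): sum over a of P(A=a) * sum_i P(c_i|A=a) * log(P(c_i|A=a) / P(c_i))."""
    n = len(labels)
    prior = {c: k / n for c, k in Counter(labels).items()}  # P(c_i)
    by_value = defaultdict(list)
    for a, c in zip(feature_values, labels):
        by_value[a].append(c)
    dd = 0.0
    for group in by_value.values():
        p_a = len(group) / n  # P(A = a)
        for c, k in Counter(group).items():
            posterior = k / len(group)  # P(c_i | A = a)
            dd += p_a * posterior * log(posterior / prior[c])
    return dd


# Toy data, made up for illustration
classes = ["n", "n", "y", "y"]
informative = ["a", "a", "b", "b"]    # posterior differs from the prior
uninformative = ["a", "b", "a", "b"]  # posterior equals the prior
print(directed_divergence(informative, classes) > directed_divergence(uninformative, classes))  # True
```

Under the evaluation rule described earlier, the informative feature would be preferred because its divergence is larger. With a base-2 logarithm this discrete estimate takes the same value as the information gain, which is one way to see why the two families of measures behave similarly.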
 
 