Table 7.2 Accuracy metric and derivatives for a two-class (positive class and negative class) problem

Measure             Mathematical form
Accuracy            (tp + tn) / (tp + fp + tn + fn)
Error rate          1 − Accuracy
Chi-squared         n (fp × fn − tp × tn)^2 / [(tp + fp)(tp + fn)(fp + tn)(tn + fn)]
Information gain    e(tp + fn, fp + tn) − [(tp + fp) e(tp, fp) + (tn + fn) e(fn, tn)] / (tp + fp + tn + fn),
                    where e(x, y) = −(x / (x + y)) log2(x / (x + y)) − (y / (x + y)) log2(y / (x + y))
Odds ratio          (tpr / (1 − tpr)) / (fpr / (1 − fpr)) = (tp × tn) / (fp × fn)
Probability ratio   tpr / fpr

Here tp, tn, fp and fn are the true positive, true negative, false positive and false negative counts, n = tp + fp + tn + fn, tpr = tp / (tp + fn) and fpr = fp / (fp + tn).
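To make the table concrete, the following minimal Python sketch (not from the book; the function names two_class_metrics and entropy2 are our own) computes every measure in Table 7.2 directly from the four confusion-matrix counts.

```python
import math

def entropy2(x, y):
    """e(x, y) from Table 7.2; the convention 0 * log2(0) = 0 is used."""
    total = x + y
    e = 0.0
    for count in (x, y):
        if count > 0:
            p = count / total
            e -= p * math.log2(p)
    return e

def two_class_metrics(tp, tn, fp, fn):
    """All Table 7.2 measures from the confusion-matrix counts.
    Assumes no zero denominators (every cell of the matrix is non-empty)."""
    n = tp + tn + fp + fn
    tpr = tp / (tp + fn)          # true positive rate
    fpr = fp / (fp + tn)          # false positive rate
    accuracy = (tp + tn) / n
    return {
        "accuracy": accuracy,
        "error_rate": 1 - accuracy,
        "chi_squared": n * (fp * fn - tp * tn) ** 2
                       / ((tp + fp) * (tp + fn) * (fp + tn) * (tn + fn)),
        "information_gain": entropy2(tp + fn, fp + tn)
                            - ((tp + fp) * entropy2(tp, fp)
                               + (tn + fn) * entropy2(fn, tn)) / n,
        "odds_ratio": (tp * tn) / (fp * fn),
        "probability_ratio": tpr / fpr,
    }

# Example: 40 true positives, 45 true negatives, 5 false positives, 10 false negatives.
print(two_class_metrics(tp=40, tn=45, fp=5, fn=10))
```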
7.2.3 Filter, Wrapper and Embedded Feature Selection
This is surely the best-known and most widely employed categorization of FS methods [33]. In the following, we detail the three well-known categories of feature selectors: filter, wrapper and embedded.
7.2.3.1 Filters
There has been an extensive research effort in the development of indirect performance measures for selecting features, mostly based on the four evaluation measures described before (information, distance, dependency and consistency). This model is called the filter model.
The filter approach operates independently of the DM method subsequently employed. The name "filter" derives from filtering the undesirable features out before learning. Filters use heuristics based on general characteristics of the data to evaluate the goodness of feature subsets.
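As an illustration, a minimal filter might discard every feature whose dependency with the class label falls below a threshold, before any learner is involved. The sketch below is our own: correlation_filter and the threshold value are illustrative choices, with Pearson correlation standing in for any dependency measure.

```python
import math

def pearson(xs, ys):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy) if sx and sy else 0.0

def correlation_filter(X, y, threshold=0.1):
    """Keep indices of features whose |correlation| with the label reaches
    the threshold; runs before, and independently of, any DM method."""
    kept = []
    for j in range(len(X[0])):
        column = [row[j] for row in X]
        if abs(pearson(column, y)) >= threshold:
            kept.append(j)
    return kept

# Feature 0 tracks the label, feature 1 is noise: only index 0 survives.
X = [[1, 7], [2, 3], [8, 5], [9, 2]]
y = [0, 0, 1, 1]
print(correlation_filter(X, y, threshold=0.5))   # -> [0]
```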
Some authors differentiate a sub-category of filters called rankers. It includes methods that apply some criterion with which to score each feature and provide a ranking. Using this ordering, the subsequent learning process or a user-defined threshold can decide the number of useful features.
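A minimal ranker can be sketched as follows (our own illustration; rank_features, select_top_k and the toy mean_diff score are hypothetical names, and any criterion from Table 7.2, such as information gain or chi-squared, could replace the scoring function).

```python
def mean_diff(column, y):
    """Toy scoring criterion: gap between the class-conditional means."""
    pos = [v for v, label in zip(column, y) if label == 1]
    neg = [v for v, label in zip(column, y) if label == 0]
    return abs(sum(pos) / len(pos) - sum(neg) / len(neg))

def rank_features(X, y, score=mean_diff):
    """Score every feature and return (index, score) pairs, best first."""
    ranking = [(j, score([row[j] for row in X], y)) for j in range(len(X[0]))]
    return sorted(ranking, key=lambda pair: pair[1], reverse=True)

def select_top_k(ranking, k):
    """A user-defined threshold on the ranking: keep the k best features."""
    return [j for j, _ in ranking[:k]]

X = [[1, 7, 0], [2, 3, 1], [8, 5, 0], [9, 2, 1]]
y = [0, 0, 1, 1]
ranking = rank_features(X, y)
print(ranking)                   # features ordered by decreasing score
print(select_top_k(ranking, 1))  # e.g. keep only the single best feature
```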
The reasons that motivate the use of filters are related to noise removal, data simplification and increasing the performance of any DM technique. They are well suited to high-dimensional data and provide general subsets of features that can be useful for any kind of learning process: rule induction, Bayesian models or ANNs.
A filter model of FS consists of two stages (see Fig. 7.2): (1) FS using measures such as information, distance, dependence or consistency, independently of the learning algorithm; (2) learning and testing, in which the algorithm learns from the training data restricted to the best feature subset obtained and is tested over the test data. Stage 2 is the usual learning and testing stage of any DM process.
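The two stages can be captured in a few lines of Python. In the sketch below (our own; filter_pipeline and its select, learn and evaluate parameters are hypothetical placeholders for any selection measure, DM algorithm and quality metric), stage 1 never sees the learner and stage 2 never sees the discarded features. Any of the earlier sketches, such as correlation_filter, could serve as the select argument.

```python
def filter_pipeline(X_train, y_train, X_test, y_test, select, learn, evaluate):
    # Stage 1: pick a feature subset using only general characteristics of
    # the training data, with complete independence of the learning algorithm.
    kept = select(X_train, y_train)

    def project(X):
        """Restrict a dataset to the selected feature indices."""
        return [[row[j] for j in kept] for row in X]

    # Stage 2: learn from the training data restricted to the chosen subset,
    # then test the resulting model on the (equally restricted) test data.
    model = learn(project(X_train), y_train)
    return evaluate(model, project(X_test), y_test)
```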