[Figure 5.21 ROC Curves: plot comparing Approach 1 and Approach 2; x-axis: False Acceptance Rate.]
Glossary
In the following we provide a brief introduction to the machine learning algorithms used in
the manuscript. For more details, please refer to standard textbooks on machine learning,
such as Bishop (2006).
AdaBoost
AdaBoost is a very successful machine-learning method for building an accurate prediction rule. Its principle is to find many rough rules of thumb instead of a single highly accurate rule; put more simply, the idea is to build a strong classifier by combining weaker ones. AdaBoost has proven to be an effective and powerful classifier in the category of ensemble techniques. The algorithm takes as input training examples $(x_1, y_1), \ldots, (x_N, y_N)$, where each $x_i$ ($i = 1, \ldots, N$) is an example belonging to some domain or instance space $X$, and each label $y_i$ is a Boolean value belonging to the domain $Y = \{-1, +1\}$, indicating whether $x_i$ is a positive or a negative example. Over a finite number of iterations $t = 1, \ldots, T$, the algorithm calls, at each iteration $t$, a weak classifier (or learner) $h_t : X \longrightarrow \{-1, +1\}$. After $T$ rounds it generates a set of hypotheses $\{h_t\}_{t=1}^{T}$. The final classifier $H(x)$ is the strongest one; it is given by the combination of these hypotheses, weighted by their respective weight factors $\{\alpha_t\}_{t=1}^{T}$, which are determined at each iteration $t$:

$$H(x) = \operatorname{sign}\left(\sum_{t=1}^{T} \alpha_t\, h_t(x)\right).$$

The selection of the best hypothesis $h_t$ at each iteration $t$ is done among a set of hypotheses $\{h_j\}_{j=1}^{J}$, where $J$ stands for the number of features considered for the classification task: $h_t$ is the $h_j$ that gives the smallest classification error $\epsilon_j$. The error $\epsilon_j$ corresponds to the samples that are misclassified, and those samples will see their associated weights increased at the next iteration $t + 1$. These procedures are presented in Algorithm 9.
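The procedure above can be sketched in code. The following is a minimal illustration (not the manuscript's Algorithm 9) that uses one threshold "stump" per feature as the pool of hypotheses $\{h_j\}$, picks the stump with the smallest weighted error at each round, and reweights the misclassified samples; the function names and the stump form are choices made here for illustration.

```python
import numpy as np

def train_adaboost(X, y, T):
    """Minimal AdaBoost sketch: one threshold stump per feature.

    X: (N, J) array of examples; y: (N,) labels in {-1, +1}.
    Returns a list of (feature, threshold, polarity, alpha_t) tuples.
    """
    N, J = X.shape
    w = np.full(N, 1.0 / N)        # sample weights, initially uniform
    ensemble = []
    for t in range(T):
        best = None
        # Select the hypothesis h_j with the smallest weighted error eps_j.
        for j in range(J):
            for thresh in np.unique(X[:, j]):
                for polarity in (1, -1):
                    pred = np.where(polarity * (X[:, j] - thresh) >= 0, 1, -1)
                    eps = np.sum(w[pred != y])
                    if best is None or eps < best[0]:
                        best = (eps, j, thresh, polarity, pred)
        eps, j, thresh, polarity, pred = best
        eps = min(max(eps, 1e-10), 1 - 1e-10)   # guard against division by zero
        alpha = 0.5 * np.log((1 - eps) / eps)   # weight alpha_t of hypothesis h_t
        ensemble.append((j, thresh, polarity, alpha))
        # Misclassified samples get their weight increased for iteration t + 1.
        w *= np.exp(-alpha * y * pred)
        w /= w.sum()
    return ensemble

def predict(ensemble, X):
    """Strong classifier H(x) = sign(sum_t alpha_t h_t(x))."""
    score = np.zeros(X.shape[0])
    for j, thresh, polarity, alpha in ensemble:
        score += alpha * np.where(polarity * (X[:, j] - thresh) >= 0, 1, -1)
    return np.where(score >= 0, 1, -1)
```

The exhaustive search over thresholds is quadratic in the number of samples per feature; practical implementations sort each feature once and scan candidate thresholds, but the weighting and selection logic is the same.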