architecture does not need it and can additionally output pose information alongside the face recognition result, while achieving higher classification accuracy than conventional methods supplied with precise pose information. Moreover, it is not surprising that in recent years almost all the top algorithms in various real-world data-mining competitions, such as KDD-Cup¹ and the Netflix Prize², have exploited ensemble methods.
In class imbalance learning (CIL), ensemble methods are widely used both to improve existing methods and to help design brand-new ones. A famous example is the ensemble method designed by Viola and Jones [2, 3] for face detection. Face detection requires indicating, in real time, which parts of an image contain a face. A typical image has about 50,000 sub-windows representing different scales and locations [2], and for each of them it must be determined whether it contains a face or not. Typically, there are only a few dozen faces among these sub-windows in an image. Furthermore, there are often more non-face images than images containing any face. Thus, non-face sub-windows can outnumber face-containing sub-windows by a factor of $10^4$. Viola and Jones [2, 3] designed a boosting-based ensemble method to deal with this severe class imbalance problem. This method, together with a cascade-style learning structure, is able to achieve a very high detection rate while keeping a very low false-positive rate; see the sketch below. This face detector is recognized as one of the breakthroughs of the past decades. In addition, ensemble methods have been used to improve over-sampling [4] and under-sampling [5, 6], and a number of boosting-based methods have been developed to handle class-imbalanced data [2, 4, 7, 8].
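The following is a minimal sketch of the cascade idea behind such a detector, not the original Viola-Jones training procedure: each stage is a small boosted ensemble whose acceptance threshold is set to retain nearly all remaining positives, so the many easy negatives are rejected cheaply in early stages. The stage count, thresholds, and synthetic data are illustrative assumptions.

```python
# Sketch of a cascade of boosted classifiers for imbalanced detection.
# Assumptions: 3 stages, a 99% per-stage detection rate, and toy data
# with roughly 1 positive per 100 negatives.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier

X, y = make_classification(n_samples=20000, n_features=20,
                           weights=[0.99, 0.01], random_state=0)

def train_cascade(X, y, n_stages=3, min_detection_rate=0.99):
    """Train stages in sequence; each stage's threshold is chosen so
    that it still accepts `min_detection_rate` of remaining positives."""
    stages = []
    for _ in range(n_stages):
        clf = AdaBoostClassifier(n_estimators=20, random_state=0)
        clf.fit(X, y)
        scores = clf.decision_function(X)
        # Threshold below which only 1% of positive scores fall.
        thr = np.quantile(scores[y == 1], 1.0 - min_detection_rate)
        stages.append((clf, thr))
        keep = scores >= thr          # windows passed to the next stage
        X, y = X[keep], y[keep]
        if (y == 1).sum() == 0 or (y == 0).sum() == 0:
            break                     # fitting needs both classes
    return stages

def cascade_predict(stages, X):
    """A window is declared positive only if it survives every stage."""
    alive = np.ones(len(X), dtype=bool)
    for clf, thr in stages:
        if alive.any():
            alive[alive] &= clf.decision_function(X[alive]) >= thr
    return alive.astype(int)

stages = train_cascade(X, y)
print(cascade_predict(stages, X).sum(), "windows accepted as positive")
```

Because each stage discards most negatives, later stages train on progressively less imbalanced data, which is what makes the cascade attractive when negatives vastly outnumber positives.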
We now introduce the notation used in this chapter. By default, we consider binary classification problems. Let $D = \{(x_i, y_i)\}_{i=1}^{n}$ be the training set, with $y \in \{-1, +1\}$. The class with $y = +1$ is the positive class with $n_+$ examples, and we suppose it is the minority class; the class with $y = -1$ is the negative class with $n_-$ examples, and we suppose it is the majority class. So we have $n_+ < n_-$, and the level of imbalance is $r = n_-/n_+$. The subset of the training data containing all the minority-class examples is $\mathcal{P}$, and the subset containing all the majority-class examples is $\mathcal{N}$. Assume that the data are independently and identically sampled from a distribution $\mathcal{D}$ on $\mathcal{X} \times \mathcal{Y}$, where $\mathcal{X}$ is the input space and $\mathcal{Y}$ is the output space. A learning algorithm $L$ trains a classifier $h: \mathcal{X} \to \mathcal{Y}$.
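The following small sketch makes this notation concrete, under the assumption of a label vector with values in $\{-1, +1\}$; the toy data and the 10% positive rate are illustrative, not from the source.

```python
# P and N are the minority (positive) and majority (negative) subsets,
# and r = n_- / n_+ is the level of imbalance.
import numpy as np

X = np.random.randn(1000, 5)                     # toy inputs (assumption)
y = np.where(np.random.rand(1000) < 0.1, 1, -1)  # ~10% positives

P, N = X[y == 1], X[y == -1]   # minority / majority subsets
n_pos, n_neg = len(P), len(N)
r = n_neg / n_pos              # imbalance level r = n_- / n_+
print(f"n+ = {n_pos}, n- = {n_neg}, r = {r:.1f}")
```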
4.2 ENSEMBLE METHODS
The most central concept in machine learning is generalization ability, which indicates how well unseen data can be predicted by the learner trained
¹ KDD-Cup is the most famous data-mining competition, covering various real-world applications such as network intrusion, bioinformatics, and customer relationship management. For further details, refer to http://www.sigkdd.org/kddcup/
² Netflix is an online digital video disk (DVD) rental service. The Netflix Prize is a data-mining competition held every year since 2007 to help improve the accuracy of movie recommendations for users. For further details, refer to http://www.netflixprize.com/