By solving Equation 5.7 and finding the optimal values for $\alpha_i$, $w$ can be recovered as in Equation 5.8:

$$w = \sum_{i=1}^{l} \alpha_i y_i \phi(x_i) \qquad (5.8)$$
and $b$ can be determined from the KKT conditions given in Equation 5.5. The data points having nonzero $\alpha_i$ values are called support vectors. Finally, the SVM decision function can be given by:

$$f(x) = \operatorname{sign}(w \cdot \phi(x) + b) = \operatorname{sign}\left( \sum_{i=1}^{l} \alpha_i y_i K(x_i, x) + b \right) \qquad (5.9)$$
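To make Equations 5.8 and 5.9 concrete, the following is a minimal NumPy sketch of evaluating the dual-form decision function; the helper names, the RBF kernel choice, and the value of gamma are illustrative assumptions, not prescribed by the text.

```python
import numpy as np

def rbf_kernel(x1, x2, gamma=0.5):
    # K(x1, x2) = exp(-gamma * ||x1 - x2||^2); gamma is an assumed value.
    return np.exp(-gamma * np.sum((x1 - x2) ** 2))

def svm_decision(x, support_vectors, alphas, labels, b, kernel=rbf_kernel):
    # f(x) = sign(sum_i alpha_i * y_i * K(x_i, x) + b), as in Equation 5.9.
    # Only the support vectors (nonzero alpha_i) contribute to the sum.
    s = sum(a * y * kernel(sv, x)
            for a, y, sv in zip(alphas, labels, support_vectors))
    return np.sign(s + b)
```

Note that $w$ itself (Equation 5.8) never needs to be formed explicitly: for a nonlinear kernel, $\phi(x_i)$ may live in a very high-dimensional space, so the kernelized sum is the practical form of the decision function.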
5.3 SVMs AND CLASS IMBALANCE
Although SVMs often produce effective solutions for balanced datasets, they are sensitive to imbalance in the data and can produce suboptimal models. Veropoulos et al. [8], Wu and Chang [9], and Akbani et al. [10] have studied this problem closely and proposed several possible reasons why SVMs can be sensitive to class imbalance, which are discussed in the following subsections.
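Before examining those reasons, the effect itself is easy to reproduce. The following sketch trains a standard soft-margin SVM on a synthetic imbalanced problem; the 95:5 class ratio, dataset parameters, and value of C are illustrative assumptions.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.metrics import classification_report

# Synthetic binary problem with a 95:5 majority/minority split (assumed).
X, y = make_classification(n_samples=2000, n_features=10,
                           weights=[0.95, 0.05], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

# A standard soft-margin SVM: the same misclassification cost C is
# applied to both classes.
clf = SVC(kernel="rbf", C=1.0).fit(X_tr, y_tr)

# Recall on the minority class (label 1) typically comes out far below
# recall on the majority class, i.e., a suboptimal model for the minority.
print(classification_report(y_te, clf.predict(X_te), digits=3))
```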
5.3.1 Weakness of the Soft Margin Optimization Problem
It has been identified that the separating hyperplane of an SVM model trained on an imbalanced dataset can be skewed toward the minority class [8], and this skewness can degrade the model's performance on the minority class. This phenomenon can be explained as follows.
Recall the objective function of the SVM soft margin optimization problem, which was given in Equation 5.3 previously:

$$\min \ \frac{1}{2} w \cdot w + C \sum_{i=1}^{l} \xi_i \qquad (5.10)$$

$$\text{s.t.} \quad y_i (w \cdot \phi(x_i) + b) \geq 1 - \xi_i, \qquad \xi_i \geq 0, \ i = 1, \ldots, l$$
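For concreteness, here is a minimal sketch of evaluating this objective for a given $(w, b)$ in the linear case ($\phi$ taken as the identity map, an assumption made for illustration); for fixed $(w, b)$, each optimal slack $\xi_i$ equals the hinge loss $\max(0, 1 - y_i(w \cdot x_i + b))$, which the code uses directly. The function name is hypothetical.

```python
import numpy as np

def soft_margin_objective(w, b, X, y, C):
    # (1/2) w.w + C * sum_i xi_i, with xi_i = max(0, 1 - y_i*(w.x_i + b)).
    # X holds one example per row; y holds labels in {-1, +1}.
    margins = y * (X @ w + b)
    slacks = np.maximum(0.0, 1.0 - margins)  # xi_i >= 0 by construction
    return 0.5 * np.dot(w, w) + C * np.sum(slacks)
```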
The first part of this objective function focuses on maximizing the margin, while the second part attempts to minimize the penalty term associated with misclassifications, where the regularization parameter $C$ can also be considered the assigned misclassification cost. Since we consider the same misclassification cost for all training examples (i.e., the same value of $C$ for both positive and negative examples), in order to reduce the penalty term, the total number of misclassified examples is minimized, regardless of the class to which they belong.