subject to

$$\sum_{i=1}^{n} \alpha_i y_i = 0$$

$$0 \le \alpha_i \le C, \qquad i = 1, \ldots, n$$
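These constraints, together with the dual objective that precedes them, define a quadratic program in the multipliers $\alpha_i$. As a rough numerical sketch only, the snippet below solves such a program with `scipy.optimize.minimize`; the objective used here is the standard soft-margin dual, $\max_{\alpha} \sum_i \alpha_i - \tfrac{1}{2}\sum_{i,j}\alpha_i\alpha_j y_i y_j K(\mathbf{x}_i,\mathbf{x}_j)$, which is assumed rather than quoted from this text, and the toy data, linear kernel, and value of $C$ are illustrative placeholders.

```python
import numpy as np
from scipy.optimize import minimize

# Toy two-class data (illustrative only).
X = np.array([[1.0, 1.0], [2.0, 2.5], [0.5, 2.0],
              [-1.0, -1.0], [-2.0, -1.5], [-0.5, -2.0]])
y = np.array([1, 1, 1, -1, -1, -1], dtype=float)
n = len(y)
C = 1.0  # box-constraint parameter (assumed value)

K = X @ X.T                              # linear-kernel Gram matrix (for simplicity)
Q = (y[:, None] * y[None, :]) * K        # Q_ij = y_i y_j K(x_i, x_j)

def neg_dual(alpha):
    # Negative of the assumed dual objective (minimized instead of maximized).
    return 0.5 * alpha @ Q @ alpha - alpha.sum()

constraints = {"type": "eq", "fun": lambda a: a @ y}  # sum_i alpha_i y_i = 0
bounds = [(0.0, C)] * n                               # 0 <= alpha_i <= C

res = minimize(neg_dual, x0=np.zeros(n), bounds=bounds,
               constraints=constraints, method="SLSQP")
print("alpha:", np.round(res.x, 4))
```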
Note that the explicit definition of the nonlinear function $\phi(\cdot)$ has been circumvented by the use of a kernel function, defined formally as the dot product of the nonlinear functions

$$K(\mathbf{x}_i, \mathbf{x}_j) = \phi(\mathbf{x}_i) \cdot \phi(\mathbf{x}_j) \qquad (3.37)$$
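To make Equation 3.37 concrete, the short sketch below (an illustration, not part of the original text) checks numerically that a degree-2 polynomial kernel $K(\mathbf{x},\mathbf{z}) = (\mathbf{x}\cdot\mathbf{z})^2$ returns the same value as the ordinary dot product $\phi(\mathbf{x})\cdot\phi(\mathbf{z})$ of an explicit quadratic feature map, so the kernel yields the inner product in the mapped space without the map ever being evaluated.

```python
import numpy as np

def poly_kernel(x, z):
    # Homogeneous degree-2 polynomial kernel: K(x, z) = (x . z)^2
    return (x @ z) ** 2

def phi(x):
    # Explicit feature map for the same kernel (2-D input -> 3-D feature space):
    # phi(x) = (x1^2, x2^2, sqrt(2) * x1 * x2)
    return np.array([x[0] ** 2, x[1] ** 2, np.sqrt(2) * x[0] * x[1]])

x = np.array([1.0, 2.0])
z = np.array([3.0, -1.0])

print(poly_kernel(x, z))   # 1.0  ->  (1*3 + 2*(-1))^2
print(phi(x) @ phi(z))     # same value via the explicit mapping
```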
This method relies on Mercer's theorem (Cristianini and Shawe-Taylor, 2000): the kernel computes the inner product of the mapped vectors implicitly, without explicitly defining the mapping of the inputs into the higher-dimensional space, which may be infinite dimensional. The use of kernels will not be discussed in detail here; the interested reader is referred to Schölkopf et al. (1999) and Schölkopf and Smola (2002). Many kernels have been proposed in the literature, such as the polynomial, radial basis function (RBF), and Gaussian kernels, which will be defined explicitly later. The trained classifier (machine) then has the following form:
$$f(\mathbf{x}) = \operatorname{sign}\left( \sum_{i=1}^{n} \alpha_i y_i K(\mathbf{x}, \mathbf{x}_i) + b \right) \qquad (3.38)$$
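As a sketch of how Equation 3.38 is evaluated once the multipliers are known, the function below takes trained values of $\alpha_i$, $y_i$, and $b$ together with a kernel and classifies a new point; the RBF kernel, its width, and all numerical values are hypothetical placeholders rather than results from this text.

```python
import numpy as np

def rbf_kernel(a, b, gamma=0.5):
    # Radial basis function kernel: K(a, b) = exp(-gamma * ||a - b||^2)
    return np.exp(-gamma * np.sum((a - b) ** 2))

def decision(x, X_train, y_train, alpha, b, kernel=rbf_kernel):
    # Equation 3.38: f(x) = sign( sum_i alpha_i y_i K(x, x_i) + b )
    s = sum(a_i * y_i * kernel(x, x_i)
            for a_i, y_i, x_i in zip(alpha, y_train, X_train))
    return np.sign(s + b)

# Placeholder "trained" quantities, for illustration only.
X_train = np.array([[1.0, 1.0], [-1.0, -1.0]])
y_train = np.array([1.0, -1.0])
alpha = np.array([0.8, 0.8])
b = 0.0

print(decision(np.array([0.9, 1.2]), X_train, y_train, alpha, b))  # -> 1.0
```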
DEFINITION 3.3.1 (Karush-Kuhn-Tucker Optimality Conditions) The solution (Vapnik, 2000) to the SVM classification optimization problem is found when the following Karush-Kuhn-Tucker (KKT) conditions are met for all $\alpha_i$, where $i = 1, \ldots, n$:

$$y_i f(\mathbf{x}_i) > 1 \quad \text{if } \alpha_i = 0$$
$$y_i f(\mathbf{x}_i) < 1 \quad \text{if } \alpha_i = C$$
$$y_i f(\mathbf{x}_i) = 1 \quad \text{if } 0 < \alpha_i < C$$
The preceding solution is usually sparse, in that most of the Lagrange multipliers end up being zero, that is, $\alpha_i = 0$. This causes the corresponding product terms in Equation 3.38 to drop out, so the decision function can be represented solely by the nonzero $\alpha_i$. The training vectors that correspond to nonzero $\alpha_i$ are called the support vectors. Table 3.1 summarizes the training examples according to the values of their multipliers $\alpha_i$.
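This sparsity can be exercised directly: given multipliers returned by a solver, the sketch below (hypothetical numbers, with a small tolerance standing in for exact zero) keeps only the support vectors and groups the training examples by their $\alpha_i$ values in the spirit of the KKT conditions of Definition 3.3.1.

```python
import numpy as np

C = 1.0
tol = 1e-6  # numerical tolerance for treating a multiplier as 0 or C

# Hypothetical multipliers returned by a dual solver (illustration only).
alpha = np.array([0.0, 0.37, 1.0, 0.0, 0.52, 0.0])

is_zero = alpha <= tol
is_bound = alpha >= C - tol
is_margin = ~is_zero & ~is_bound          # 0 < alpha_i < C

support_idx = np.flatnonzero(~is_zero)    # support vectors: alpha_i > 0
print("non-support vectors (alpha_i = 0):       ", np.flatnonzero(is_zero))
print("margin support vectors (0 < alpha_i < C):", np.flatnonzero(is_margin))
print("bound support vectors (alpha_i = C):     ", np.flatnonzero(is_bound))
print("support vectors used in Eq. 3.38:        ", support_idx)
```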
3.3.2 Support Vector Regression
In the regression formulation the standard SVM framework is extended
to solve the more general function estimation problem. The data set in