Information Technology Reference
In-Depth Information
Because maximum-margin hyper-plane is defined by weight vector W, it is easy
to recognize that the essence of constructing maximum-margin hyper-plane is to
solve the following constrained optimization problem:
1
|
W
|
2
Minimize
b
subject to y i (W T
X i + b)
1 ,
2
W ,
This constraints can be re-written as below:
1
2
Minimize
b
|
W
|
subject to 1 - y i ( W T
i
X i + b )
0 ,
(4)
2
W ,
1
2
|
W
|
The reason of minimize
is that distance between two parallel hyper-
2
2
W
planes is
and we need to maximize such distance in order to maximize the
|
|
2
W
margin for maximum-margin hyper-plane. Then maximizing
is to
|
|
1
|
W
|
minimize
. Because |W| is the norm of W being complex to compute, we
2
1
1
2
|
W
|
|
W
|
substitute
with
. This is the constrained optimization problem or
2
2
quadratic programming (QP) optimization problem. Note that | W | 2 can be
computed by scalar product of it and itself:
| W | 2 = W T
W
The way to solve QP problem in formula III.3.2 is through its Lagrange dual.
Suppose we want to minimize f(x) subject to g(x)=0 , it exists a solution x 0 to set of
equations:
(
f
(
x
)
+
α
g
(
x
))
=
0
x
=
x
0
x
g
(
x
)
=
0
where
is Lagrange multiplier.
For multi constraints g i (x) = 0 (i=
α
1
n
), there is a Lagrange multiplier for each
constraint
n
(
f
(
x
)
+
α
g
(
x
)
=
0
i
i
x
i
=
1
x
=
x
0
g
(
x
)
=
0
i
 
Search WWH ::




Custom Search