Suppose λ* is a solution of (III.3.1) satisfying condition (III.3.8); in other words, L(λ*) is maximized. The weight vector W* representing the maximum-margin hyper-plane is then recovered from λ* and the training vectors X_i:
W^* = \sum_{i=1}^{n} \lambda_i^* y_i X_i    (11)
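As an illustration, here is a minimal NumPy sketch of Eq. (11). The names lam_star, y, and X are hypothetical, and the numeric values are only illustrative; they stand in for the output of a dual solver.

```python
import numpy as np

# A minimal sketch of Eq. (11). lam_star holds the optimal multipliers
# λ*, y the labels in {-1, +1}, and X one training vector per row.
# The numbers below are illustrative, not from the text.
lam_star = np.array([0.0, 0.25, 0.0, 0.25])
y = np.array([-1, -1, 1, 1])
X = np.array([[1.0, 1.0],
              [2.0, 0.0],
              [3.0, 3.0],
              [4.0, 2.0]])

# Eq. (11): W* = sum_i λ_i* y_i X_i
W_star = (lam_star * y) @ X          # -> array([0.5, 0.5])
```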
The bias b* is then computed from any support vector X_i as below:
b^* = y_i - W^{*T} X_i    (12)
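Continuing the same sketch, Eq. (12) can be evaluated at any point whose multiplier is non-zero; the 1e-8 tolerance is an assumption made to absorb numerical noise in a solver's output.

```python
# Eq. (12): b* = y_i - W*ᵀ X_i, evaluated at a support vector, i.e. a
# point with a non-zero multiplier. The 1e-8 tolerance is an assumption.
sv = int(np.argmax(lam_star > 1e-8))   # index of one support vector (here: 1)
b_star = y[sv] - W_star @ X[sv]        # -> -2.0
```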
The rule for classification in (3) becomes:
f(X_i) = R = \mathrm{sign}(W^{*T} X_i + b^*)    (13)
Fig. 3 Classification function f mapping X to y ∈ {−1, 1}
This means that whenever we need to determine to which class a new vector X_i belongs, it suffices to substitute X_i into W*ᵀ X_i + b* and check the value of this expression. If the value is less than or equal to −1, then X_i belongs to class y_i = −1. Otherwise, if the value is greater than or equal to 1, then X_i belongs to class y_i = 1. Hence the function (W*ᵀ X_i + b*) is called the classification function, or classification rule.
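Putting the pieces together, the rule of Eq. (13) becomes a one-line function in the sketch above; classify and X_new are hypothetical names.

```python
# Eq. (13): the classification rule sign(W*ᵀ X + b*), continuing the
# sketch above. classify and X_new are hypothetical names.
def classify(X_new):
    """Predict the class in {-1, +1} for a new vector X_new."""
    return int(np.sign(W_star @ X_new + b_star))

print(classify(np.array([0.5, 0.5])))  # -1: falls on the y = -1 side
print(classify(np.array([4.0, 3.0])))  # +1: falls on the y = +1 side
```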
The Lagrange multipliers are non-zero only when Wᵀ X_i + b equals 1 or −1; the vectors X_i for which this holds are called support vectors, since they are the closest to the maximum-margin hyper-plane. These vectors lie on two parallel hyper-planes, which is why this approach is called a support vector machine.
Fig. 4 Support vectors
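In the sketch above, the support vectors can be read off as the points with non-zero multipliers, and one can verify that they indeed satisfy W*ᵀ X_i + b* = ±1; the 1e-8 tolerance is again an assumption.

```python
# Support vectors are the points with non-zero multipliers; by the KKT
# conditions they satisfy W*ᵀ X_i + b* = ±1. Continues the sketch above.
for i in np.flatnonzero(lam_star > 1e-8):
    print(i, W_star @ X[i] + b_star)   # prints: 1 -1.0  and  3 1.0
```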