Information Technology Reference
In-Depth Information
*
λ
in (10) is quadratic programming (QP) problem.
There are many approaches to sole this problem; for instance, the sequential
minimal optimization is described as below:
However the way to find
−
A QP with two variables is trivial to solve
λ
,
λ
−
At each iteration, a pair of (
) is picked up and QP is solved with these
i
j
variables; repeat until convergence
2.2 Applying Support Vector Machine into Document
Classification
Give corpus
D
= {
D
1
, D
2
, D
3
,…, D
m
} in which every document
D
i
is modeled by a
tf-idf
weight vector. Suppose there are
n
terms {
t
1
, t
2
,…, t
n
}, we have:
D
i
= (
d
i1
, d
i2
,…, d
in
)
where
d
ij
is product of term frequency and inverse document frequency
d
ij
=
tf
ij
*
idf
j
.
If the index of document is ignored, document
D
is represented as below:
D
= (
d
1
, d
2
,…, d
n
)
Given
k
classes {
l
1
, l
2
,…, l
k
}, there is demand of classifying documents into such
classes. The technique of classification based on SVM is
two-class
classification
in which the classes is +1, -1 for
y
i
=+1, -1
respectively. So we need to determine
unique hyper-plane referred to as
two-class
classifier. It is possible to extend
two-
class
classification to
k-class
classification by constructing
k
two-class
classifier.
In means that we must specify
k
couple of optimal weight vector
W
i
*
and bias
b
i
*
.
Each couple (
W
i
*
, b
i
*
) being a
two-class
classifier is the representation of class
l
i
.
The process of finding (
W
i
*
, b
i
*
) in training corpus
D
is described in the section of
support vector machine.
Table 1
k
couple (
W
i
*
, b
i
*
) corresponds with
k
class {
l
1
, l
2
,…, l
k
}
Class
Weight vector
Bias
Classification rule
W
1
*
b
1
*
R
1
= W
1
*T
⊗
l
1
X + b
1
*
W
2
*
b
2
*
⊗
l
2
R
2
= W
2
*T
X + b
2
*
…
…
…
…
W
k
*
b
k
*
R
k
= W
k
*T
⊗
l
k
X + b
k
*
For example, classifying document
D=
(
d
1
, d
2
,…, d
n
) is described as below:
1.
For each classification rule
R
i
=
W
i
*T
⊗
X + b
i
*
, substituting each D into such
rule. It means that vector
X
in such rule is replaced by document
D
.
Expression
i
=
W
i
*T
⊗
D + b
i
*
2.
Suppose there is a sub-set of rules {
R
h+1
, R
h+2
,…, R
h+r
} that the value of
expression
Expression
i
=
W
i
*T
⊗
D + b
i
*
is greater than or equal
1
. We can