Let $D_{eval}$ be a test data set and let
$$K = \left|\{\, w \mid \mathrm{decide}(w) = P(w),\ w \in D_{eval} \,\}\right|$$
be the number of correctly classified elements in $D_{eval}$. To evaluate the performance of a given pattern classification system we use a simple accuracy measure:
$$A = \frac{K}{|D_{eval}|},$$
which gives the fraction of the correctly classified elements from the test set $D_{eval}$.
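The accuracy measure can be sketched in a few lines of Python. This is only an illustration: the names `accuracy`, `decide`, and `true_label` are hypothetical, with `true_label` standing in for the predicate $P$ and `decide` for the classifier's decision function.

```python
def accuracy(test_set, decide, true_label):
    """Return (K, A) for a test set.

    K is the number of elements w with decide(w) == true_label(w),
    and A = K / |test_set| is the fraction classified correctly.
    """
    if not test_set:
        raise ValueError("the test set must not be empty")
    k = sum(1 for w in test_set if decide(w) == true_label(w))
    return k, k / len(test_set)


if __name__ == "__main__":
    # Toy stand-ins (assumptions, not the classifiers from the text):
    # the "true" label is whether a word has even length, and the
    # hypothetical classifier guesses True for every word.
    words = ["ab", "abc", "abab", "a"]
    k, a = accuracy(words, decide=lambda w: True,
                    true_label=lambda w: len(w) % 2 == 0)
    print(k, a)
```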
Notice that the number of correctly classified elements follows the binomial distribution, and $A$ can be viewed as an estimate of the probability $p$ of a word being classified correctly.
We are interested in constructing a confidence interval for the probability $p$. For binomial variates, exact confidence intervals do not exist in general. One can obtain an approximate $100(1-\alpha)\%$ confidence interval $[p_S, p_L]$ by solving the following equations for $p_S$ and $p_L$:
$$\sum_{i=K}^{|D_{eval}|} \binom{|D_{eval}|}{i}\, p_S^{\,i}\, (1 - p_S)^{|D_{eval}| - i} = \alpha/2,$$
$$\sum_{i=0}^{K} \binom{|D_{eval}|}{i}\, p_L^{\,i}\, (1 - p_L)^{|D_{eval}| - i} = \alpha/2$$
for a given $\alpha$.
Exact solutions to the equations above can be obtained by reexpressing in
terms of the incomplete beta function (see [1] for details).
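As a rough numerical illustration, the two tail equations can also be solved directly. The sketch below does not use the incomplete beta function mentioned above; it substitutes plain bisection on the binomial tail sums, which needs only the Python standard library. The lower endpoint is the root of the upper-tail equation (sum over $i$ from $K$ to $|D_{eval}|$) and the upper endpoint is the root of the lower-tail equation (sum over $i$ from $0$ to $K$). All function names and the tolerance are assumptions made for this example.

```python
import math


def binom_pmf(i, n, p):
    """C(n, i) * p**i * (1 - p)**(n - i), evaluated in log space for 0 < p < 1."""
    log_coeff = math.lgamma(n + 1) - math.lgamma(i + 1) - math.lgamma(n - i + 1)
    return math.exp(log_coeff + i * math.log(p) + (n - i) * math.log1p(-p))


def tail_sum(first, last, n, p):
    """Sum of binomial probabilities for i = first, ..., last."""
    return sum(binom_pmf(i, n, p) for i in range(first, last + 1))


def solve_for_p(tail, target, increasing, tol=1e-10):
    """Bisection on (0, 1) for tail(p) == target, where tail is monotone in p."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if (tail(mid) < target) == increasing:
            lo = mid  # the root lies to the right of mid
        else:
            hi = mid  # the root lies to the left of mid
    return (lo + hi) / 2


def binomial_confidence_interval(k, n, alpha=0.05):
    """Return (lower, upper) for k correct classifications out of n."""
    if n <= 0 or not 0 <= k <= n:
        raise ValueError("need n > 0 and 0 <= k <= n")
    # Upper-tail equation: P(X >= k) = alpha / 2, increasing in p.
    lower = 0.0 if k == 0 else solve_for_p(
        lambda p: tail_sum(k, n, n, p), alpha / 2, increasing=True)
    # Lower-tail equation: P(X <= k) = alpha / 2, decreasing in p.
    upper = 1.0 if k == n else solve_for_p(
        lambda p: tail_sum(0, k, n, p), alpha / 2, increasing=False)
    return lower, upper


if __name__ == "__main__":
    print(binomial_confidence_interval(8, 10, alpha=0.05))
```

For 8 correct classifications out of 10 with $\alpha = 0.05$, this yields an interval of roughly (0.444, 0.975). Bisection is slower than evaluating beta quantiles, but it mirrors the equations term by term, which is the point of the sketch.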
4.3 Results of Experiments
Experiments were repeated with the following types of kernel functions:
$K_1$: linear;
$K_2$: quadratic, $(b\, x_i \cdot x_j + c)^2$;
$K_3$: polynomial of degree 3, $(b\, x_i \cdot x_j + c)^3$;
$K_4$: polynomial of degree 4, $(b\, x_i \cdot x_j + c)^4$;
$K_e$: Gaussian, $e^{-\gamma \|x_i - x_j\|^2}$,
where $x_i$, $x_j$ are the feature vectors obtained with the mapping $f_{WG}$ and $x_i \cdot x_j$ is the inner product of $x_i$ and $x_j$.
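The kernel families listed above can be written out as short Python functions. This is a minimal sketch on plain lists of numbers: the default values of `b`, `c`, and `gamma` are placeholders (the text does not state the values used), and taking the linear kernel to be the plain inner product is an assumption, since no formula is given for $K_1$.

```python
import math


def dot(x, y):
    """Inner product of two equal-length feature vectors."""
    return sum(a * b for a, b in zip(x, y))


def linear_kernel(x, y):
    """K_1, assumed here to be the plain inner product."""
    return dot(x, y)


def polynomial_kernel(x, y, degree, b=1.0, c=1.0):
    """K_2, K_3, K_4: (b * x.y + c) ** degree for degree = 2, 3, 4."""
    return (b * dot(x, y) + c) ** degree


def gaussian_kernel(x, y, gamma=1.0):
    """K_e: exp(-gamma * ||x - y||^2)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)


if __name__ == "__main__":
    # Toy vectors, not features produced by the mapping f_WG.
    x, y = [1.0, 2.0], [3.0, 4.0]
    print(linear_kernel(x, y))
    print(polynomial_kernel(x, y, degree=2))
    print(gaussian_kernel(x, y, gamma=0.5))
```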
The results of the experiments are presented in Table 2. They show that SVMs with an appropriate kernel perform well not only on free groups of small rank but on groups of large rank as well. The experiments confirmed earlier observations that the classes of minimal and nonminimal words are not linearly separable. Moreover, once the rank and, therefore, the dimensionality of the feature space