Information Technology Reference
In-Depth Information
Ta b l e 2 .
Collective results — each row encapsulates 10 repetitions. Tasks:
c.
— classification,
r. e .
— regression estimation. We denote experiments on finite or infinite sets of functions by
setting either
N
or
h
. For regression estimation we use probabilistic
B
calculated as
R
emp
(
ω
I
)+
B
−
ln η
B
/
(
2
I
)
. In all experiments
η
=
0
.
2, hence for
n
=
3 the probability that bounds are
−
α
(
η
,
n
)=
0
.
496 or greater and for
n
=
5itis1
−
α
(
η
,
n
)=
0
.
511 or greater.
true is 1
observed
range of
C
(10 repetitions)
ratio of
C
inside
bounds
no.
of
exp. task
bounds
[
V
−
ε
L
,
V
+
ε
U
]
K
N
h
I
R
emp
(
ω
I
)
V
n
10
3
1
c.
50
10
-
0
.
412
0
.
456 3
[
0
.
254
,
0
.
550
] [
0
.
351
,
0
.
445
]
1
.
0
10
3
0
.
345
0
.
389 3
[
0
.
187
,
0
.
483
] [
0
.
352
,
0
.
385
]
1
.
0
2
c.
200 10
-
10
4
3
c.
200 10
-
0
.
369
0
.
383 3
[
0
.
319
,
0
.
413
] [
0
.
371
,
0
.
383
]
1
.
0
10
4
4
c.
200 10
-
0
.
396
0
.
410 5
[
0
.
344
,
0
.
442
] [
0
.
386
,
0
.
401
]
1
.
0
10
4
5
c.
50 100
-
0
.
408
0
.
426 3
[
0
.
349
,
0
.
456
] [
0
.
392
,
0
.
418
]
1
.
0
10
4
6
c.
200 100
-
0
.
336
0
.
354 3
[
0
.
277
,
0
.
384
] [
0
.
332
,
0
.
338
]
1
.
0
10
5
0
.
401
0
.
407 3
[
0
.
383
,
0
.
417
] [
0
.
398
,
0
.
403
]
1
.
0
7
c.
50 100
-
10
5
8
c.
50
-
51
0
.
181
0
.
250 3
[
0
.
021
,
0
.
267
] [
0
.
181
,
0
.
184
]
1
.
0
10
5
9
c.
200
-
201
0
.
035
0
.
161 3
[
−
0
.
25
,
0
.
185
] [
0
.
035
,
0
.
037
]
1
.
0
51
(
B
=
0
.
193) 10
4
9
r.e.
50
-
0
.
172
0
.
209 3
[
0
.
078
,
0
.
223
] [
0
.
170
,
0
.
173
]
1
.
0
51
(
B
=
0
.
194) 10
4
10
r.e.
50
-
0
.
171
0
.
208 5
[
0
.
085
,
0
.
212
] [
0
.
170
,
0
.
172
]
1
.
0
201
(
B
=
0
.
020) 10
5
11
r.e. 200
-
0
.
012
0
.
015 3
[
0
.
006
,
0
.
016
] [
0
.
012
,
0
.
013
]
1
.
0
201
(
B
=
0
.
020) 10
5
12
r.e. 200
-
0
.
013
0
.
015 5
[
0
.
007
,
0
.
016
] [
0
.
012
,
0
.
013
]
1
.
0
5
Experiments — SRM
In this section we show results of the
Structural Risk Minimization
approach. We con-
sider a
structure
i.e. a sequence of nested subsets of functions:
S
1
⊂
S
2
⊂···⊂
S
K
,where
each successive
S
k
=
f
(
x
,
)
ω
∈
Ω
k
is a set of functions with Vapnik-Chervonenkis di-
mension
h
k
, and we have
h
1
<
h
2
<
ω
<
h
K
. As the best element of the structure
we choose
S
∗
(with VC dimension
h
∗
) for which the bound on generalization
V
is the
smallest.
Along with observing the bound
V
, we observe: (1) the cross-validation result
C
,
(2) our bounds on
C
, (3) the actual
true risk R
calculated as an integral according to
its definition (1). We pay particular attention to how the minimum point of SRM at
h
∗
differs from the minimum suggested by the cross-validation and the minimum of true
risk (which normally in practice is unknown). We remind that obtaining the result
C
for
each
h
k
is
O
(
n
)
times more laborious than obtaining
V
for each
h
k
.Seefig.5.
···