Information Technology Reference
In-Depth Information
Ta b l e 2 . Collective results — each row encapsulates 10 repetitions. Tasks: c. — classification,
r. e . — regression estimation. We denote experiments on finite or infinite sets of functions by
setting either N or h . For regression estimation we use probabilistic B calculated as R emp ( ω I )+
B ln η B / ( 2 I ) . In all experiments
η = 0 . 2, hence for n = 3 the probability that bounds are
α ( η , n )= 0 . 496 or greater and for n = 5itis1
α ( η , n )= 0 . 511 or greater.
true is 1
observed
range of C
(10 repetitions)
ratio of
C inside
bounds
no.
of
exp. task
bounds
[ V ε L , V + ε U ]
K
N
h
I
R emp ( ω I ) V
n
10 3
1
c.
50
10
-
0 . 412
0 . 456 3 [ 0 . 254 , 0 . 550 ] [ 0 . 351 , 0 . 445 ]
1 . 0
10 3
0 . 345
0 . 389 3 [ 0 . 187 , 0 . 483 ] [ 0 . 352 , 0 . 385 ]
1 . 0
2
c.
200 10
-
10 4
3
c.
200 10
-
0 . 369
0 . 383 3 [ 0 . 319 , 0 . 413 ] [ 0 . 371 , 0 . 383 ]
1 . 0
10 4
4
c.
200 10
-
0 . 396
0 . 410 5 [ 0 . 344 , 0 . 442 ] [ 0 . 386 , 0 . 401 ]
1 . 0
10 4
5
c.
50 100
-
0 . 408
0 . 426 3 [ 0 . 349 , 0 . 456 ] [ 0 . 392 , 0 . 418 ]
1 . 0
10 4
6
c.
200 100
-
0 . 336
0 . 354 3 [ 0 . 277 , 0 . 384 ] [ 0 . 332 , 0 . 338 ]
1 . 0
10 5
0 . 401
0 . 407 3 [ 0 . 383 , 0 . 417 ] [ 0 . 398 , 0 . 403 ]
1 . 0
7
c.
50 100
-
10 5
8
c.
50
-
51
0 . 181
0 . 250 3 [ 0 . 021 , 0 . 267 ] [ 0 . 181 , 0 . 184 ]
1 . 0
10 5
9
c.
200
-
201
0 . 035
0 . 161 3 [ 0 . 25 , 0 . 185 ] [ 0 . 035 , 0 . 037 ]
1 . 0
51
( B = 0 . 193) 10 4
9
r.e.
50
-
0 . 172
0 . 209 3 [ 0 . 078 , 0 . 223 ] [ 0 . 170 , 0 . 173 ]
1 . 0
51
( B = 0 . 194) 10 4
10
r.e.
50
-
0 . 171
0 . 208 5 [ 0 . 085 , 0 . 212 ] [ 0 . 170 , 0 . 172 ]
1 . 0
201
( B = 0 . 020) 10 5
11
r.e. 200
-
0 . 012
0 . 015 3 [ 0 . 006 , 0 . 016 ] [ 0 . 012 , 0 . 013 ]
1 . 0
201
( B = 0 . 020) 10 5
12
r.e. 200
-
0 . 013
0 . 015 5 [ 0 . 007 , 0 . 016 ] [ 0 . 012 , 0 . 013 ]
1 . 0
5
Experiments — SRM
In this section we show results of the Structural Risk Minimization approach. We con-
sider a structure i.e. a sequence of nested subsets of functions: S 1
S 2 ⊂···⊂
S K ,where
each successive S k = f ( x ,
)
ω Ω k is a set of functions with Vapnik-Chervonenkis di-
mension h k , and we have h 1 < h 2 <
ω
< h K . As the best element of the structure
we choose S (with VC dimension h ) for which the bound on generalization V is the
smallest.
Along with observing the bound V , we observe: (1) the cross-validation result C ,
(2) our bounds on C , (3) the actual true risk R calculated as an integral according to
its definition (1). We pay particular attention to how the minimum point of SRM at h
differs from the minimum suggested by the cross-validation and the minimum of true
risk (which normally in practice is unknown). We remind that obtaining the result C for
each h k is O ( n ) times more laborious than obtaining V for each h k .Seefig.5.
···
 
Search WWH ::




Custom Search