Theorem 9. Let $\{Q(z, \omega)\}_{\omega \in \Omega}$ be an infinite set of real-valued bounded functions, $0 \le Q(z, \omega) \le B$, with finite Vapnik-Chervonenkis dimension $h$. Then, with probability $1 - \alpha(\eta, n)$ or greater, the following inequality holds true:
$$C \;\le\; V + B \sqrt{\frac{h\left(\ln \frac{2n}{h} + 1\right) - \ln \frac{\eta}{4}}{n}} + B \sqrt{\frac{\ln \frac{1}{\eta}}{2I}}. \quad (47)$$
Theorem 10. With probability $1 - \alpha(\eta, n)$ or greater, the following inequality holds true:
$$V \;\le\; C + B \sqrt{\frac{h\left(\ln \frac{2n}{h} + 1\right) - \ln \frac{\eta}{4}}{n}} + B \sqrt{\frac{\ln \frac{1}{\eta}}{2I}}. \quad (48)$$
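Assuming the forms of (47) and (48) given above, the following minimal sketch evaluates the two slack terms that separate the cross-validation result $C$ from the Vapnik bound $V$: the VC-dimension term and the Chernoff term. The function names and the example parameter values are illustrative assumptions, not part of the original text.

```python
import math

def vc_term(h, n, B, eta):
    # VC slack term: B * sqrt((h(ln(2n/h) + 1) - ln(eta/4)) / n)
    return B * math.sqrt((h * (math.log(2 * n / h) + 1) - math.log(eta / 4)) / n)

def chernoff_term(I, B, eta):
    # Chernoff slack term for a testing set of size I: B * sqrt(ln(1/eta) / (2I))
    return B * math.sqrt(math.log(1 / eta) / (2 * I))

# Example: h = 10, n = 1000, testing sets of size I = 100, errors bounded
# by B = 1, confidence parameter eta = 0.05 (all values illustrative)
slack = vc_term(10, 1000, 1.0, 0.05) + chernoff_term(100, 1.0, 0.05)
print(f"total slack between C and V: {slack:.3f}")
```

As the example suggests, both terms shrink with more data, but the Chernoff term depends only on the testing-set size $I$, which motivates the tightening remark below.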
In practice, bounds (47) and (48) can be significantly tightened by using an estimate $\hat{B}$ in place of the most pessimistic $B$. The estimate $\hat{B}$ can be found by performing just one fold of cross-validation (instead of $n$ folds) and bounding $B$ by the mean error on the testing set plus a square root implied by the Chernoff inequality:

$$\hat{B} = R_{\mathrm{emp}}(\omega_I) + B \sqrt{\frac{\ln \frac{1}{\eta_B}}{2I}}, \quad (49)$$
where $\eta_B$ is an imposed small probability that (49) is not true. The reasoning behind this remark is that, in practice, typical learning algorithms rarely produce functions $f(x, \omega_I)$ with high maximal errors in the process of ERM. Therefore, we can insert the right-hand side of (49) into (47) and (48) in place of $B$. If this is done, then the minimal overall probability on bounds (47) and (48) should be adjusted to $1 - \alpha(\eta, n) - \eta_B$.
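As a sketch of this tightening step, the fragment below computes $\hat{B}$ from a single held-out fold according to (49). The helper name, the toy error values, and the parameter choices are illustrative assumptions.

```python
import math

def estimate_B_hat(test_errors, B, eta_B):
    """Bound B by the mean error on one testing fold plus the
    Chernoff deviation term, as in (49)."""
    I = len(test_errors)                         # size of the testing set
    mean_error = sum(test_errors) / I            # R_emp(omega_I)
    deviation = B * math.sqrt(math.log(1 / eta_B) / (2 * I))
    return mean_error + deviation

# Toy per-example losses on one testing fold, with pessimistic range
# B = 1 and imposed failure probability eta_B = 0.01 (illustrative values)
errors = [0.12, 0.05, 0.31, 0.08, 0.17]
B_hat = estimate_B_hat(errors, B=1.0, eta_B=0.01)
```

The returned $\hat{B}$ then replaces $B$ inside (47) and (48), at the cost of subtracting $\eta_B$ from the overall confidence level.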
4 Experiments — Bounds Checks
Results of three experiments are shown in this section, for the following cases: (1) binary classification, finite set of functions, (2) binary classification, infinite set of functions, (3) regression estimation, infinite set of functions.
4.1 Set of Functions
The form of the functions $f$, $f: [0,1]^2 \to [-1,1]$, was Gaussian-like:

$$f(x, w_0, w_1, \ldots, w_K) = \max\left(-1,\; \min\left(1,\; w_0 + \sum_{k=1}^{K} w_k \exp\left(-\frac{\|x - \mu_k\|^2}{\sigma_k^2}\right)\right)\right). \quad (50)$$
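A direct implementation of (50) might look as follows; the centers $\mu_k$, widths $\sigma_k$, and weights are free parameters, and the way they are sampled here is an assumption for illustration only.

```python
import numpy as np

def f(x, w0, w, mu, sigma):
    """Gaussian-like function from (50): a clipped weighted sum of
    radial Gaussian bumps, mapping [0,1]^2 into [-1,1].

    x     : point in [0,1]^2, shape (2,)
    w0    : bias weight
    w     : weights w_1..w_K, shape (K,)
    mu    : centers mu_1..mu_K, shape (K, 2)
    sigma : widths sigma_1..sigma_K, shape (K,)
    """
    bumps = np.exp(-np.sum((x - mu) ** 2, axis=1) / sigma ** 2)
    return np.clip(w0 + np.dot(w, bumps), -1.0, 1.0)

# Example with K = 3 bumps and randomly drawn parameters (illustrative)
rng = np.random.default_rng(0)
K = 3
value = f(np.array([0.5, 0.5]),
          w0=0.1,
          w=rng.uniform(-1, 1, K),
          mu=rng.uniform(0, 1, (K, 2)),
          sigma=rng.uniform(0.1, 0.5, K))
```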