Theorem 9. Let {Q(z, ω)}_{ω∈Ω} be an infinite set of real-valued bounded functions, 0 ≤ Q(z, ω) ≤ B, with finite Vapnik–Chervonenkis dimension h. Then, with probability 1 − α(η, n) or greater, the following inequality holds true:

C − V ≤ B (√[(h(ln(2(1 − 1/n)I/h) + 1) − ln(η/4)) / ((1 − 1/n)I)] + √[−ln η / (2I)]).   (47)
Theorem 10. With probability 1 − α(η, n) or greater, the following inequality holds true:

V − C ≤ B (√[(h(ln(2(1 + 1/n)I/h) + 1) − ln(η/4)) / ((1 + 1/n)I)] + √[−ln η / (2I)]).   (48)
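Both bounds are built from the same two ingredients: a Vapnik-style VC confidence radius and a Chernoff/Hoeffding radius. The following is a minimal numerical sketch of these ingredients (the function names, argument layout, and the default B = 1 are our assumptions, not the chapter's):

```python
import math

def vc_radius(h, m, eta, B=1.0):
    # Vapnik-style confidence radius for [0, B]-bounded functions with
    # VC dimension h, evaluated on m samples:
    # B * sqrt((h * (ln(2m/h) + 1) - ln(eta/4)) / m)
    return B * math.sqrt((h * (math.log(2.0 * m / h) + 1.0)
                          - math.log(eta / 4.0)) / m)

def chernoff_radius(m, eta, B=1.0):
    # Chernoff/Hoeffding deviation radius: B * sqrt(-ln(eta) / (2m))
    return B * math.sqrt(-math.log(eta) / (2.0 * m))

# Example: h = 10, a sample of m = 1000 points, confidence 1 - eta
print(vc_radius(10, 1000, 0.05))
print(chernoff_radius(1000, 0.05))
```

Both radii shrink as the sample size m grows, which is why the bounds tighten with more data.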
In practice, bounds (47) and (48) can be significantly tightened by using an estimate B̂ in the place of the most pessimistic B. The estimate B̂ can be found by performing just one fold of cross-validation (instead of n folds) and bounding B̂ by the mean error on the testing set plus a square root implied by the Chernoff inequality:
B̂ ≤ R_emp(ω, I) + B √[−ln η_B / (2I)],   (49)
where η_B is an imposed small probability that (49) is not true. The reasoning behind this remark is that, in practice, typical learning algorithms rarely produce functions f(x, ω) having high maximal errors in the process of ERM. Therefore, we can insert the right-hand side of (49) into (47) and (48) in the place of B. If this is done, then the minimal overall probability on bounds (47) and (48) should be adjusted to 1 − α(η, n) − η_B.
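As a small sketch of this remark, the estimate B̂ can be computed from a single fold as the mean error on the testing set plus the Chernoff-implied square root, as in (49). The function name and arguments below are our own; errors are assumed to lie in [0, B]:

```python
import math

def estimate_B_hat(test_errors, eta_B, B=1.0):
    # B-hat = mean error on the testing set + Chernoff radius, cf. (49);
    # eta_B is the imposed small probability that the bound fails.
    m = len(test_errors)
    mean_err = sum(test_errors) / m
    return mean_err + B * math.sqrt(-math.log(eta_B) / (2.0 * m))

# Example: 100 test errors, all equal to 0.1, eta_B = 0.05
print(estimate_B_hat([0.1] * 100, 0.05))  # ≈ 0.2224
```

Plugging this value in place of B tightens (47) and (48) whenever the learning algorithm's errors stay well below the worst-case bound.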
4 Experiments — Bounds Checks

Results of three experiments are shown in this section, for the following cases: (1) binary classification, finite set of functions; (2) binary classification, infinite set of functions; (3) regression estimation, infinite set of functions.
4.1 Set of Functions
The form of the functions f, f : [0, 1]² → [−1, 1], was Gaussian-like:

f(x, w0, w1, …, wK) = max(−1, min(1, w0 + Σ_{k=1}^{K} wk exp(−‖x − µk‖² / σk²))).   (50)
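As a sketch, the family (50) can be coded directly; the clipping by max/min enforces the [−1, 1] codomain (the argument layout below is our choice, not the chapter's):

```python
import math

def f(x, w, centers, sigmas):
    # Gaussian-like model of (50): bias w[0] plus K Gaussian bumps,
    # clipped to [-1, 1]; x and centers[k] are points in [0, 1]^2.
    s = w[0]
    for k in range(1, len(w)):
        mu, sigma = centers[k - 1], sigmas[k - 1]
        dist2 = (x[0] - mu[0]) ** 2 + (x[1] - mu[1]) ** 2
        s += w[k] * math.exp(-dist2 / sigma ** 2)
    return max(-1.0, min(1.0, s))

# A single bump of weight 1 centred at (0.5, 0.5): f attains 1 there
print(f((0.5, 0.5), [0.0, 1.0], [(0.5, 0.5)], [0.2]))  # 1.0
```

Varying the weights wk, centres µk, and widths σk sweeps out the infinite set of functions used in experiments (2) and (3).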