[Figure 5: two panels titled "risks, bounds", one for I = 300, n = 5 and one for I = 400, n = 5. Horizontal axis: h, from 10 to 190, with the optimum h* marked. Plot labels: V + ε_U, V, C, R(ω_I), R_emp(ω_I), K.]
Fig. 5. SRM experiments. With I = 300, optimum points reached at: h* = 91 (SRM), h = 91 (C), h = 151 (true risk R). With I = 400, optimum points reached at: h* = 111 (SRM), h = 131 (C), h = 151 (true risk R).
6 Summary
In this paper we consider the probabilistic relationship between two quantities: the Vapnik generalization bound V and the result C of an n-fold non-stratified cross-validation. Results stated in the machine learning (and SLT) literature typically have a different focus, namely the relation between the true risk (generalization error) and either of the two quantities V, C separately. The perspective we chose was intended to:
- stay in the setting of the Structural Risk Minimization approach based on Vapnik bounds,
- not perform the cross-validation procedure,
- be able to make probabilistic statements about the closeness of SRM results to cross-validation results (if such were performed) for given conditions of the learning experiment.
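For reference, the quantity C that the SRM result is compared against can be sketched as follows. This is a minimal illustration, not the procedure used in the experiments: the function name n_fold_cv, the callables fit and loss, and the choice to define C as the plain mean of the per-fold test errors are all assumptions made here for illustration.

```python
import random


def n_fold_cv(samples, n, fit, loss, seed=0):
    """Non-stratified n-fold cross-validation estimate (assumed form of C).

    samples: list of (x, y) pairs.
    fit, loss: hypothetical caller-supplied callables;
        fit(train_samples) returns a predictor f(x),
        loss(prediction, y) returns a number.
    """
    if n < 2 or len(samples) < n:
        raise ValueError("need at least 2 folds and at least n samples")

    # Non-stratified: a plain random permutation, class labels are ignored.
    indices = list(range(len(samples)))
    random.Random(seed).shuffle(indices)
    folds = [indices[k::n] for k in range(n)]

    fold_errors = []
    for k in range(n):
        # Train on all folds except the k-th, test on the k-th.
        train = [samples[i] for j in range(n) if j != k for i in folds[j]]
        predictor = fit(train)
        test_losses = [loss(predictor(samples[i][0]), samples[i][1])
                       for i in folds[k]]
        fold_errors.append(sum(test_losses) / len(test_losses))

    # Assumption: C is the mean of the per-fold test errors.
    return sum(fold_errors) / n
```

With I = 300 samples and n = 5, each fold holds 60 test samples and each predictor is trained on the remaining 240. The sketch only makes explicit what would be computed if cross-validation were performed, which is the computation the approach summarized above is meant to avoid.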