[Figure 5, two panels: I = 300, n = 5 and I = 400, n = 5. Vertical axis: risks, bounds. Horizontal axis: h, from 10 to 190. Plotted curves: V + ε_U, V, C, R(ω_I), R_emp(ω_I), K.]
Fig. 5. SRM experiments. With I = 300, optimum points reached at: h = 91 (SRM), h = 91 ( C ),
h = 151 (true risk R ). With I = 400, optimum points reached at: h = 111 (SRM), h = 131 ( C ),
h = 151 (true risk R ).
6 Summary
In this paper we consider the probabilistic relationship between two
quantities: the Vapnik generalization bound V and the result C of an n-fold non-stratified
cross-validation. The literature on machine learning (and SLT) typically states results
with a different focus, namely the relation between the true risk
(generalization error) and either of the two quantities V, C separately. The perspective
we chose was intended to:
- stay in the setting of the Structural Risk Minimization approach based on Vapnik
bounds,
- not perform the cross-validation procedure,
- be able to make probabilistic statements about the closeness of SRM results to
cross-validation results (if cross-validation were performed) for given conditions of
the learning experiment.
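
To make the two quantities concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes the common textbook form of the VC bound, V = R_emp + sqrt((h (ln(2I/h) + 1) - ln(η/4)) / I), which may differ from the exact bound used in this paper. The names vapnik_bound, srm_select, nonstratified_folds and the parameter eta are hypothetical, introduced here only for illustration.

```python
import math
import random


def vapnik_bound(r_emp, h, I, eta=0.05):
    """Upper bound V on the true risk.

    Assumption: textbook VC bound, holding with probability >= 1 - eta;
    the paper's own bound may have a different form.
    """
    if not 0 < h <= I:
        raise ValueError("need 0 < h <= I")
    penalty = math.sqrt((h * (math.log(2 * I / h) + 1) - math.log(eta / 4)) / I)
    return r_emp + penalty


def srm_select(r_emp_by_h, I, eta=0.05):
    """SRM choice: the complexity h whose bound V is smallest."""
    return min(r_emp_by_h, key=lambda h: vapnik_bound(r_emp_by_h[h], h, I, eta))


def nonstratified_folds(I, n, seed=0):
    """Split indices 0..I-1 into n random folds, ignoring class labels."""
    idx = list(range(I))
    random.Random(seed).shuffle(idx)
    return [idx[k::n] for k in range(n)]


if __name__ == "__main__":
    # Made-up empirical risks for a few complexities h (illustrative only).
    r_emp_by_h = {10: 0.40, 50: 0.15, 90: 0.10, 150: 0.09}
    print("SRM choice of h:", srm_select(r_emp_by_h, I=300))
    print("fold sizes:", [len(f) for f in nonstratified_folds(300, 5)])
```

The cross-validation result C would be the average test error over the n folds, computed for each h. The point of the paper's perspective is that SRM needs only the single pass that yields R_emp and V, while statements about how close its choice lies to the minimizer of C are made probabilistically.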