Probabilistic Connection between Cross-Validation and Vapnik Bounds - Agents and Artificial Intelligence - page 48


[Figure 5, two panels: I = 300, n = 5 and I = 400, n = 5. Vertical axis: risks, bounds. Horizontal axis: h (ticks from 10 to 190), with h∗ marked. Curve labels: V + ε_U, V, C, R(ω_I), R_emp(ω_I), K.]

Fig. 5. SRM experiments. With I = 300, optimum points reached at: h∗ = 91 (SRM), h = 91 (C), h = 151 (true risk R). With I = 400, optimum points reached at: h∗ = 111 (SRM), h = 131 (C), h = 151 (true risk R).

6 Summary

In this paper we consider the probabilistic relationship between two quantities: the Vapnik generalization bound V and the result C of an n-fold non-stratified cross-validation. Results stated in the literature on machine learning (and SLT) typically have a different focus, namely the relation between the true risk (generalization error) and either of the two quantities V, C separately. The perspective we chose was intended to:

- stay in the setting of the Structural Risk Minimization approach based on Vapnik bounds,
- not perform the cross-validation procedure,
- be able to make probabilistic statements about the closeness of SRM results to cross-validation results (had cross-validation been performed) for given conditions of the learning experiment.
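To make the two quantities concrete, the following minimal Python sketch computes a Vapnik-type bound and an n-fold non-stratified cross-validation estimate. It rests on assumptions not taken from this section: the bound uses one common textbook form, R_emp + sqrt((h(ln(2I/h) + 1) - ln(η/4)) / I), which may differ from the exact expression used for V in the paper, and the names `vapnik_bound`, `nonstratified_folds`, `cross_validation`, `fit`, and `eta` are hypothetical, chosen only for illustration.

```python
import math
import random


def vapnik_bound(r_emp, h, I, eta=0.05):
    """Assumed textbook VC-type bound on the true risk (not necessarily the paper's V).

    r_emp: empirical risk, h: VC dimension, I: sample size,
    eta: the bound is meant to hold with probability at least 1 - eta.
    """
    eps = math.sqrt((h * (math.log(2 * I / h) + 1) - math.log(eta / 4)) / I)
    return r_emp + eps


def nonstratified_folds(I, n, seed=0):
    """Shuffle indices 0..I-1 and cut them into n folds, ignoring class labels."""
    idx = list(range(I))
    random.Random(seed).shuffle(idx)
    return [idx[k::n] for k in range(n)]


def cross_validation(xs, ys, n, fit, seed=0):
    """Mean misclassification rate over n non-stratified folds.

    fit(train_xs, train_ys) must return a callable predictor x -> label.
    """
    folds = nonstratified_folds(len(xs), n, seed)
    errors = []
    for test in folds:
        held_out = set(test)
        train = [i for i in range(len(xs)) if i not in held_out]
        predict = fit([xs[i] for i in train], [ys[i] for i in train])
        wrong = sum(predict(xs[i]) != ys[i] for i in test)
        errors.append(wrong / len(test))
    return sum(errors) / n
```

With these helpers, a value of the bound for a given h and I can be placed next to a cross-validation result C obtained on the same sample, which is the kind of comparison the probabilistic statements above concern.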
