$$
C = \frac{1}{n} \sum_{k=1}^{n} R_{\mathrm{emp}}(\omega_{I,k}). \tag{12}
$$
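To make the estimator concrete, here is a minimal Python sketch of (12); the per-fold risks in `fold_risks` are illustrative placeholders, not values from the text:

```python
# Cross-validation estimate C from equation (12): the mean of the
# empirical risks computed on the n folds.
fold_risks = [0.12, 0.15, 0.10, 0.14, 0.13]  # R_emp(omega_{I,k}), k = 1..n (placeholders)

n = len(fold_risks)
C = sum(fold_risks) / n  # C = (1/n) * sum_k R_emp(omega_{I,k})
print(f"C = {C:.4f}")    # prints C = 0.1280
```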
The independence of folds can be formally expressed in the following way: for any two fold indices $k \neq l$ and for any numbers $A$, $B$,
$$
P\bigl(R_{\mathrm{emp}}(\omega_{I,k}) = A,\; R_{\mathrm{emp}}(\omega_{I,l}) = B\bigr) = P\bigl(R_{\mathrm{emp}}(\omega_{I,k}) = A\bigr) \cdot P\bigl(R_{\mathrm{emp}}(\omega_{I,l}) = B\bigr).
$$
We stress the independence once again because, later on, we are going to sum up several independent probabilistic inequalities into one inequality, and we want the result to hold with an effective probability equal to the product of the component probabilities.
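To spell out the product rule this relies on (a worked statement under the independence assumption above, with $E_k$ denoting the event that the $k$-th inequality holds): if each of $n$ inequalities holds with probability at least $1 - \eta$ and the folds are independent, then all of them hold simultaneously with probability at least
$$
P\Bigl(\bigcap_{k=1}^{n} E_k\Bigr) = \prod_{k=1}^{n} P(E_k) \ge (1 - \eta)^n.
$$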
2 The Relationship for a Finite Set of Approximating Functions

2.1 Classification Learning Task
Similarly to Vapnik, let us start with the classification learning task and the simplest case of a finite set of $N$ indicator functions $\{Q(z, \omega_j)\}_{\omega_j \in \Omega}$, $j = 1, 2, \ldots, N$. Not to complicate the notation, we will keep writing $\omega_I$ in the sense of the optimal function minimizing the empirical risk on our finite sample of size $I$, instead of writing more formally e.g. $\omega_{j(I)}$⁷.
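For concreteness, here is a minimal Python sketch of this setup; the threshold classifiers, the sampling distribution, and all numbers are illustrative assumptions, not part of the text:

```python
import random

random.seed(0)

# A toy finite class of N indicator functions Q(z, omega_j):
# threshold classifiers on x in [0, 1] with 0/1 misclassification loss.
N = 10
thresholds = [j / N for j in range(N)]   # omega_1, ..., omega_N

def Q(z, theta):
    x, y = z
    prediction = 1 if x >= theta else 0
    return int(prediction != y)          # indicator loss

# A sample of size I from a hypothetical distribution (true boundary at 0.35).
I = 200
sample = []
for _ in range(I):
    x = random.random()
    sample.append((x, int(x >= 0.35)))

# omega_I: the function minimizing the empirical risk on this sample.
def r_emp(theta):
    return sum(Q(z, theta) for z in sample) / I

omega_I = min(thresholds, key=r_emp)
print(f"omega_I threshold = {omega_I}, R_emp = {r_emp(omega_I):.3f}")
```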
Vapnik shows [1,2] that with probability at least $1 - \eta$, the following bound on the true risk is satisfied:
$$
R(\omega_I) = \int_Z Q(z, \omega_I)\, dF(z) \;\le\; \underbrace{\frac{1}{I} \sum_{i=1}^{I} Q(z_i, \omega_I)}_{R_{\mathrm{emp}}(\omega_I)} + \sqrt{\frac{\ln N - \ln \eta}{2I}}. \tag{13}
$$
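As an illustration, here is a minimal Python sketch that evaluates the right-hand side of (13); the function name and the sample numbers are assumptions for the example, not values from the text:

```python
import math

def vapnik_finite_bound(r_emp, n_functions, i_samples, eta):
    """Right-hand side of (13): R_emp + sqrt((ln N - ln eta) / (2I))."""
    return r_emp + math.sqrt(
        (math.log(n_functions) - math.log(eta)) / (2 * i_samples)
    )

# Illustrative values: N = 100 functions, I = 1000 samples,
# empirical risk 0.10, confidence level 1 - eta = 0.95.
print(vapnik_finite_bound(r_emp=0.10, n_functions=100, i_samples=1000, eta=0.05))
```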
The argument is the following:
$$
P\Bigl(\sup_{1 \le j \le N} \bigl(R(\omega_j) - R_{\mathrm{emp}}(\omega_j)\bigr) \ge \varepsilon\Bigr) \;\le\; \sum_{j=1}^{N} P\bigl(R(\omega_j) - R_{\mathrm{emp}}(\omega_j) \ge \varepsilon\bigr) \;\le\; N \cdot \exp(-2\varepsilon^2 I).
$$
The last inequality holds because each term in the sum satisfies the Chernoff inequality $P\bigl(R(\omega_j) - R_{\mathrm{emp}}(\omega_j) \ge \varepsilon\bigr) \le \exp(-2\varepsilon^2 I)$. By substituting the right-hand side with a small probability $\eta$ and solving for $\varepsilon$ (the algebra is spelled out after the bound), one obtains the bound:
$$
R(\omega_j) - R_{\mathrm{emp}}(\omega_j) \;\le\; \sqrt{\frac{\ln N - \ln \eta}{2I}},
$$
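The solving step behind this, spelled out as routine algebra in the same notation:
$$
N \exp(-2\varepsilon^2 I) = \eta \;\Longrightarrow\; 2\varepsilon^2 I = \ln N - \ln \eta \;\Longrightarrow\; \varepsilon = \sqrt{\frac{\ln N - \ln \eta}{2I}}.
$$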
⁷ In the sense that $j(I) \in \{1, \ldots, N\}$ returns the index of the minimizer given our data set of size $I$.