Actually, one can easily see that as we take larger samples, $I \to \infty$, and we stick to the leave-one-out cross-validation, $n = I$, the coefficient $\sqrt{n/(n-1)}$ standing at $\sqrt{(\ln N - \ln\eta)/(2I)}$ goes to 1, whereas the coefficient $\sqrt{n}$ standing at $\sqrt{(-\ln\eta)/(2I)}$ goes to infinity.
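Explicitly, substituting $n = I$ (a minimal check of the two limits, using the coefficient forms above):
\[
\sqrt{\frac{n}{n-1}} = \sqrt{\frac{I}{I-1}} \xrightarrow{\;I \to \infty\;} 1,
\qquad
\sqrt{n} = \sqrt{I} \xrightarrow{\;I \to \infty\;} \infty,
\]
so in the leave-one-out regime the second coefficient grows without bound.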
One might ask: for what choice of $n$ is each bound the tightest, given $\eta$, $I$, $N$? Treating $n$ for a moment as a continuous variable, we impose the conditions:
\[
\frac{\partial \varepsilon_U(\eta, I, N, n)}{\partial n} = 0,
\qquad
\frac{\partial \varepsilon_L(\eta, I, N, n)}{\partial n} = 0,
\]
and we get the optimal $n$ values:
\[
n_U = 1 + \left( \frac{\sqrt{\ln N - \ln\eta} + \sqrt{-\ln\eta}}{\sqrt{-\ln\eta}} \right)^{2/3}, \qquad (36)
\]
\[
n_L = 1 + \left( \frac{2\sqrt{\ln N - \ln\eta}}{\sqrt{-\ln\eta}} \right)^{2/3}. \qquad (37)
\]
Note that these values do not depend on the sample size $I$.
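To see how (36) arises (a brief sketch, assuming $\varepsilon_U$ carries the coefficients discussed at the start of this section), abbreviate $a = \sqrt{\ln N - \ln\eta}$ and $b = \sqrt{-\ln\eta}$, so that up to the common factor $1/\sqrt{2I}$ we have $\varepsilon_U \propto \sqrt{n/(n-1)}\,(a + b) + \sqrt{n}\,b$. Then
\[
\frac{\partial \varepsilon_U}{\partial n}
= -\frac{a + b}{2\sqrt{n}\,(n-1)^{3/2}} + \frac{b}{2\sqrt{n}} = 0
\quad\Longleftrightarrow\quad
(n-1)^{3/2} = \frac{a + b}{b},
\]
which yields $n_U = 1 + \bigl((a+b)/b\bigr)^{2/3}$, i.e., (36); the computation for $n_L$ is analogous, with $2a$ in place of $a + b$.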
2.2 Regression Estimation Learning Task
Now we consider a set of real-valued error functions, but we still stay with the simplest case, where the set has a finite number of elements. We give theorems for the regression estimation learning task analogous to the ones for classification. We skip the proofs; the only changes they would require are the assumption of bounded functions and the use of Hoeffding's inequality in place of the Chernoff inequality.
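For reference, Hoeffding's inequality for i.i.d. random variables $Z_1, \ldots, Z_I$ bounded in $[0, B]$, with mean $\mu$, states that
\[
P\left( \left| \frac{1}{I}\sum_{i=1}^{I} Z_i - \mu \right| \ge \epsilon \right)
\le 2\exp\left( -\frac{2 I \epsilon^2}{B^2} \right),
\]
which is the source of the factor $B$ appearing in the bound (39) below.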
Theorem 4. Let $\{Q(z, \omega_j)\}$, $\omega_j \in \Omega$, $j = 1, 2, \ldots, N$, be a finite set of real-valued bounded functions (regression estimation task) of size $N$, with $0 \le Q(z, \omega_j) \le B$. Then, for any $\eta > 0$, arbitrarily small, there is a small number
\[
\alpha(\eta, n) = \sum_{k=1}^{n} \binom{n}{k}\, \eta^{k} (1 - 2\eta)^{n-k}, \qquad (38)
\]
and the number
\[
\varepsilon(\eta, I, N, n)
= \sqrt{\frac{n}{n-1}}\, B \sqrt{\frac{\ln N - \ln\eta}{2I}}
+ \left( \sqrt{\frac{n}{n-1}} + \sqrt{n} \right) B \sqrt{\frac{-\ln\eta}{2I}}, \qquad (39)
\]
such that:
\[
P\left( |V - C| \le \varepsilon(\eta, I, N, n) \right) \ge 1 - \alpha(\eta, n). \qquad (40)
\]
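To get a feel for the magnitudes involved, the following is a minimal numerical sketch of (38) and (39), assuming the forms given above; the function names and the sample values ($\eta = 0.01$, $I = 10^4$, $N = 100$, $B = 1$) are purely illustrative.

from math import comb, log, sqrt

def alpha(eta, n):
    # Eq. (38), as stated above: confidence penalty for n folds.
    return sum(comb(n, k) * eta**k * (1 - 2 * eta)**(n - k)
               for k in range(1, n + 1))

def epsilon(eta, I, N, n, B):
    # Eq. (39): two-sided bound on |V - C| for n-fold cross-validation
    # over a finite set of N functions bounded by 0 <= Q <= B.
    a = sqrt((log(N) - log(eta)) / (2 * I))  # term with ln N - ln eta
    b = sqrt(-log(eta) / (2 * I))            # term with -ln eta
    return sqrt(n / (n - 1)) * B * a + (sqrt(n / (n - 1)) + sqrt(n)) * B * b

# Illustrative values only: eta = 0.01, I = 10**4 samples, N = 100, B = 1.
for n in (2, 5, 10, 100):
    print(n, round(alpha(0.01, n), 4), round(epsilon(0.01, 10**4, 100, n, 1.0), 4))

Running the sketch shows both the penalty $\alpha$ and the width $\varepsilon$ growing with $n$, consistent with the discussion of the leave-one-out regime above.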