Now, we plug (33) into (32) and obtain with probability $(1-\eta)^2$ or greater:
$$
\begin{aligned}
C &\ge R_{\mathrm{emp}}(\omega_I) - \sqrt{\frac{\ln N - \ln \eta}{2I/n}} - \sqrt{\frac{-\ln \eta}{2I/n}}\\
&= R_{\mathrm{emp}}(\omega_I) - \sqrt{n}\,\sqrt{\frac{\ln N - \ln \eta}{2I}} - \sqrt{\frac{-n \ln \eta}{2I}}\\
&= R_{\mathrm{emp}}(\omega_I) + \sqrt{\frac{\ln N - \ln \eta}{2I}} - \bigl(1+\sqrt{n}\bigr)\sqrt{\frac{\ln N - \ln \eta}{2I}} - \sqrt{\frac{-n \ln \eta}{2I}}\\
&= V - \bigl(1+\sqrt{n}\bigr)\sqrt{\frac{\ln N - \ln \eta}{2I}} - \sqrt{\frac{-n \ln \eta}{2I}}.
\end{aligned}
$$
This concludes the proof of theorem 3.
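To make the result concrete, the reconstructed lower bound $V - \varepsilon_L$ can be evaluated numerically. The sketch below plugs illustrative values into it; the sample size $I$, the empirical risk $R_{\mathrm{emp}}$, and all variable names are assumptions for illustration, not values from the text.

```python
import math

# Illustrative evaluation of the lower bound C >= V - eps_L of theorem 3.
# N functions, n folds, confidence parameter eta; R_emp and I are made up.
N, n, eta = 100, 5, 0.1
I = 10_000
R_emp = 0.12

vapnik_gap = math.sqrt((math.log(N) - math.log(eta)) / (2 * I))
V = R_emp + vapnik_gap                      # Vapnik bound on the full sample
eps_L = (1 + math.sqrt(n)) * vapnik_gap \
        + math.sqrt(-n * math.log(eta) / (2 * I))
print(f"V = {V:.4f}, lower bound on C: {V - eps_L:.4f}")
```

Note how $\varepsilon_L$ shrinks like $1/\sqrt{I}$, so the band around $V$ tightens as the sample grows.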
Using theorems 2 and 3 we can also say what sample size $I$ is necessary so that the difference $C - V$ or $V - C$ is less than or equal to an imposed epsilon $\varepsilon$. Let us denote the right-hand sides of the upper and lower bounds (17) and (18) by $\varepsilon_U$ and $\varepsilon_L$, respectively. Now, suppose we want to have $\varepsilon_U(\eta, I, N, n) \le \varepsilon_U$. Solving it for $I$ we get
$$
I \;\ge\; \frac{1}{2\,\varepsilon_U^2} \Bigl( \bigl(\sqrt{n}-1\bigr)\sqrt{\ln N - \ln \eta} + \sqrt{-n \ln \eta} \Bigr)^2. \tag{34}
$$
Similarly, if we want to have $\varepsilon_L(\eta, I, N, n) \le \varepsilon_L$, we get
$$
I \;\ge\; \frac{1}{2\,\varepsilon_L^2} \Bigl( \bigl(1+\sqrt{n}\bigr)\sqrt{\ln N - \ln \eta} + \sqrt{-n \ln \eta} \Bigr)^2. \tag{35}
$$
To give an example: say we have a finite set of 100 functions, $N = 100$, we perform a 5-fold cross-validation, $n = 5$, and we choose $\eta = 0.1$ and $\varepsilon_L = \varepsilon_U = 0.05$. Then it follows that we need a sample of size $I \ge 5832$ so that the cross-validation result is not worse than $V + 0.05$, whereas we need $I \ge 28314$ so that the cross-validation result is not better than $V - 0.05$. And both results are true with probability $1 - \alpha(\eta, n) \approx 0.73$ or greater.
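Under the reconstructed form of (35), the second sample size in the example can be checked directly; this sketch assumes natural logarithms and the notation used above.

```python
import math

# Sample size from (35) for the example values N = 100, n = 5,
# eta = 0.1, eps_L = 0.05.
N, n, eta, eps_L = 100, 5, 0.1, 0.05
gap = (1 + math.sqrt(n)) * math.sqrt(math.log(N) - math.log(eta)) \
      + math.sqrt(-n * math.log(eta))
I_min = math.ceil(gap ** 2 / (2 * eps_L ** 2))
print(I_min)  # 28314
```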
Remark 3. For the leave-one-out cross-validation, where $n = I$, both the lower and the upper bound loosen to a constant of order $O\bigl(\sqrt{-\ln \eta / 2}\bigr)$.
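The leave-one-out effect in Remark 3 is easy to see numerically: with $n = I$, the test-fold term of the bound no longer shrinks as the sample grows. A small sketch (variable names assumed):

```python
import math

# With n = I (leave-one-out), the term sqrt(-n*ln(eta)/(2I)) is constant:
eta = 0.1
for I in (100, 10_000, 1_000_000):
    n = I
    term = math.sqrt(-n * math.log(eta) / (2 * I))
    print(I, term)  # stays at sqrt(-ln(eta)/2) ~ 1.073 for every I
```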