Information Technology Reference
In-Depth Information
This is an important remark, because it means that both the cross-validation result C
and the Vapnik bound V converge in probability 8 to the same value 9 as the sample size
grows large. Moreover, the rate of this convergence is exponential.
0 +
Proof ( Proof of Remark 2 ). Since N is fixed, we note that for
η
ln N
ln
η
ln
η
.
2 I
2 I
Therefore, for fixed
η
, N , n there exists a constant, say D , such that
, I , N , n )= 2 n
n
1 + 1 ln N
ln
η
(
ε
η
2 I
+ n + n
n
1
D
η
2 I
ln
ln
η
.
2 I
2 / D 2 ) .
Solving the inequality for
η
we obtain
η
exp (
2 I
ε
Having in mind the inequality (16), we now give two theorems in which the absolute
value sign in
is omitted. They can be viewed as the upper and the lower proba-
bilistic bounds on C and they are derived as tighter bounds than (16). Proving these two
theorems immediately implies proving the theorem 1.
|
V
C
|
α
(
, n ) or greater, the following inequality holds true:
Theorem 2. With probability 1
η
n
n
1 ln N
ln
η
C
V
1
2 I
+ n + n
n
ln
η
. (17)
1
2 I
Theorem 3. With probability 1
α
(
η
, n ) or greater, the following inequality holds true:
2 n
n
1 + 1 ln N
+ n
ln
η
ln
η
V
C
.
(18)
2 I
2 I
The second result is more interesting, provided of course that the bound is positive for
given constants
, I , N , n . Otherwise, we get zero or negative bound, which is trivial.
The fig. 1 illustrates the sense of theorems 2 and 3.
η
We say that A ( I ) converges in probability to B , we write A ( I ) P
8
ε >
0, η > 0, there exists a treshold size of sample I ( ε , η ) , such that for all I I ( ε , η ) : P | A ( I )
B , when for any numbers
−→
I
B | > ε η
.
9
C and V can be viewed as random variables, due to random realizations of data set { z 1 ,..., z I }
with joint density p ( z ) (this affects C and V ) and due to random realizations of subsets in cross-
validation folds (this affects C ). When the data set { z 1 ,..., z I } is fixed, V is fixed too.
Search WWH ::




Custom Search