Information Technology Reference
In-Depth Information
In order to further bound the term log S G ( 2 n) , the concept of VC dimension has
been introduced and discussed for typical function classes.
Definition 22.2 The VC dimension of a class
G
is the largest n such that S G (n)
=
2 n .
In other words, the VC dimension is the size of the largest set that the function
class
can shatter. With the concept of the VC dimension, the following theorem
has been proven, which gives an upper bound for the growth function.
G
Theorem 22.3 Let
G
be a function class whose VC dimension is h . Then
n ,
h
C n ,
S
(n)
G
i =
0
and
n h ,
en
h
h
S G (n)
.
Based on the above theorem, we can obtain the uniform bound for the infinite
case, as shown below.
0 <δ< 1, with probability at least 1
δ ,
2 h log 2 e h
log δ
+
g G ,R g) R(g) +
2
.
n
In addition to the VC dimension introduced in the previous subsection, there are
also several other measures which may not only give tighter bounds, but also have
properties that make their computations possible from the data only. Specifically we
will introduce the covering number and Rademacher average, as well as the bounds
that they can lead to in the next subsections.
22.3.2.4 Uniform Bounds Based on Covering Number
We first define the normalized Hamming distance of the projected functions on the
sample as follows:
n f(z i )
1 ,...,n .
1
d n (f, f )
f (z i )
=
=
:
i
=
Then given such a distance, we say that a set of functions f 1 ,...,f N can cover
F
at radius ε if
B(f i ,ε)
where B(f i ,ε) is the ball whose center is f i and radius measured by d n is ε .
Correspondingly, we define the covering number of
F
F
as follows.
Search WWH ::




Custom Search