Information Technology Reference
In-Depth Information
P
n
E f k (Z)
1
n
f
∈{
f 1 ,...,f N }:
f k (Z i )
i =
1
NP
n
E f k (Z)
N exp
2 2 .
1
n
f k (Z i )
i
=
1
Equivalently, we have
0 <δ< 1, with probability at least 1
δ ,
log 2 δ
2 n
R(g)
g
∈{
g 1 ,...,g N }
,R(g)
+
.
22.3.2.3 Uniform Bounds in Infinite Case
While the bound introduced in the previous subsection can successfully handle the
finite function class, when the number of functions approaches infinity, it becomes
unclear whether the bound can converge. To better bound the difference between
empirical and expected risks in the infinite case, one needs to adopt some new con-
cepts, such as the VC dimension.
When the function class is uncountable (when it is countable, a trivial extension
of the results presented in the previous subsection can be directly applied and there
is no need to introduce the VC dimension), we look at the function class projected
on the sample. Specifically, given a sample z 1 ,...,z n , we consider
f(z 1 ),...,f(z n )
.
F z 1 ,...,z n =
:
f
F
The size of this set is always finite, since function f can only take binary values. An
intuitive explanation on the size of the set is the number of possible ways in which
the data z 1 ,...,z n can be classified. Based on the set, we can define the concept of
a growth function as follows.
Definition 22.1 The growth function is the maximum number of ways into which
n data samples can be classified by the function class:
S
(n) =
z 1 ,...,z n | F z 1 ,...,z n | .
sup
F
It has been proven that if we define the growth function for
G
in the same way,
we will have S F (n)
S G (n) .
Given the concept of the growth function, the following result has been obtained:
=
Theorem 22.2
0 <δ< 1, with probability at least 1
δ ,
log δ
2 log S G ( 2 n) +
R(g)
g
∈{
g 1 ,...,g N }
,R g)
+
2
.
n
Search WWH ::




Custom Search