Information Technology Reference
In-Depth Information
∈[
]
Theorem 22.1 Let Z 1 ,...,Z n be n i . i . d . random variables with f(Z)
a,b
.
Then
ε> 0, we have
2exp
.
n
f(Z i ) E f(Z)
2 2
(b
1
n
P
a) 2
i
=
1
If we denote the right-hand side of the Hoeffding's Inequality as δ , we can have
the following equivalent form of it: with probability at least 1
δ ,
log δ
2 n
n
E f(Z)
1
n
f(Z i )
(b
a)
.
i =
1
. Further considering the relationship between f and g ,
we have with probability at least 1
In our case, f(Z) ∈[
0 , 1
]
δ ,
log δ
2 n
R(g)
R(g)
+
.
The above bound looks quite nice, however, it should be noted that the bound
also has its limitations. The major issue lies in that Hoeffding's Inequality holds for
afixedfunction f , and the probability is with regards to the sampling of the data.
That is, given a function f , it is of large probability that there is a set of samples
satisfying the inequality. However, these sets can be different for different functions.
As a result, for a fixed sample, only some of the functions in
F
will satisfy the
inequality.
To tackle the aforementioned problem, one choice is to consider the supremum
of all the functions and derive a looser bound. We will make more introductions on
this in the next subsection.
22.3.2.2 Uniform Bounds in Finite Case
In order to obtain a uniform bound which is not data dependent, we need to consider
the failure case for each function in the class. For ease of discussion, here we assume
that there are a finite number of functions in the class, and denote the number as N .
Then, for f k (k
1 ,...,N) ,weuse C i to denote the set of “bad” samples, i.e.,
those for which the bound given by Hoeffding's inequality fails. That is,
=
z
.
n
E f k (Z)
1
n
C k =
:
f k (Z i )
i =
1
δ . Then according to the properties of probabilities of
union sets, we can obtain that P(C 1 ∪···∪ C N ) . As a result, we have
It is clear that P(C k )
Search WWH ::




Custom Search