Theorem 22.5 For any $0 < \delta < 1$, with probability at least $1 - \delta$,
$$R(g) \le \hat{R}(g) + 2\mathcal{R}(\mathcal{G}) + \sqrt{\frac{\log\frac{1}{\delta}}{2n}},$$
and also with probability at least $1 - \delta$,
$$R(g) \le \hat{R}(g) + 2\mathcal{R}_n(\mathcal{G}) + \sqrt{\frac{2\log\frac{2}{\delta}}{n}}.$$
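As a rough numeric illustration of how the confidence terms in these two bounds behave, the following sketch evaluates both terms for illustrative values $n = 1000$ and $\delta = 0.01$ (these numbers are assumptions for the example, not taken from the text):

```python
import math

# Illustrative values (assumptions): sample size n, confidence parameter delta
n, delta = 1000, 0.01

# Confidence term of the first bound: sqrt(log(1/delta) / (2n))
term1 = math.sqrt(math.log(1 / delta) / (2 * n))

# Confidence term of the second bound: sqrt(2 * log(2/delta) / n)
term2 = math.sqrt(2 * math.log(2 / delta) / n)
```

Both terms shrink at the rate $O(1/\sqrt{n})$ and grow only logarithmically as $\delta \to 0$, so tightening the confidence level is cheap compared to collecting more samples.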
Similar to the covering number, it is not difficult to compute upper bounds for the Rademacher average in many cases. For example, for a finite set with $|\mathcal{G}| = N$, we have $\mathcal{R}_n(\mathcal{G}) \le \sqrt{2\log N/n}$. And for a function class whose VC dimension is $h$, we have $\mathcal{R}_n(\mathcal{G}) \le C\sqrt{h/n}$.
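The finite-class bound above can be checked empirically. The following is a minimal Monte Carlo sketch (the values of $n$ and $N$, and the randomly generated class, are illustrative assumptions, not from the text): each function in a finite class $\mathcal{G}$ is represented by its vector of values on the sample, the empirical Rademacher average is estimated by averaging the supremum of correlations over random sign vectors, and the result is compared with $\sqrt{2\log N/n}$:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes (assumptions): n sample points, N functions in the class
n, N = 200, 50

# Represent each function g in G by its value vector (g(x_1), ..., g(x_n))
# with entries in {-1, +1}, so that ||g||_2 = sqrt(n).
G = rng.choice([-1.0, 1.0], size=(N, n))

# Monte Carlo estimate of the empirical Rademacher average
#   R_n(G) = E_sigma[ sup_g (1/n) sum_i sigma_i g(x_i) ]
num_draws = 2000
sigma = rng.choice([-1.0, 1.0], size=(num_draws, n))
correlations = sigma @ G.T / n          # shape (num_draws, N)
estimate = correlations.max(axis=1).mean()

# Finite-class upper bound: sqrt(2 log N / n)
bound = np.sqrt(2 * np.log(N) / n)
```

For a randomly generated class of this size the estimate falls comfortably below the bound, which reflects that $\sqrt{2\log N/n}$ is a worst-case guarantee over all finite classes of cardinality $N$.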