Theorem 22.5 For every g ∈ G and every 0 < δ < 1, with probability at least 1 − δ,

    R(g) ≤ R_n(g) + 2R(G) + √(log(1/δ)/(2n)),

and also with probability at least 1 − δ,

    R(g) ≤ R_n(g) + 2R_n(G) + √(2 log(2/δ)/n).
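As a quick numerical illustration of the two confidence terms in the theorem above, the following sketch evaluates √(log(1/δ)/(2n)) and √(2 log(2/δ)/n) for one illustrative choice of δ and n (the specific values 0.05 and 1000 are assumptions for the example, not taken from the text):

```python
import math

# Illustrative values (not from the text): confidence level 1 - delta = 0.95,
# sample size n = 1000.
delta, n = 0.05, 1000

# Slack term of the first bound: sqrt(log(1/delta) / (2n)).
term1 = math.sqrt(math.log(1 / delta) / (2 * n))

# Slack term of the second bound: sqrt(2 log(2/delta) / n).
term2 = math.sqrt(2 * math.log(2 / delta) / n)

print(f"first slack term  = {term1:.4f}")   # roughly 0.039
print(f"second slack term = {term2:.4f}")   # roughly 0.086
```

Both terms shrink at the rate 1/√n, so the bounds tighten as more samples are observed; the second bound pays a slightly larger confidence term in exchange for using the empirical (data-dependent) Rademacher average R_n(G).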
Similar to the covering number, it is not difficult to compute upper bounds on the Rademacher average in many cases. For example, for a finite set with |G| = N, we have R_n(G) ≤ 2√(log N / n). And for a function class whose VC dimension is h, we have R_n(G) ≤ C√(h/n).
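For a finite class, the empirical Rademacher average E_σ[sup_g (1/n) Σ_i σ_i g(x_i)] can be estimated directly by Monte Carlo, since the supremum is just a maximum over N function-value vectors. The following sketch does this for a hypothetical class of N random ±1-valued functions (the class itself and all sizes are assumptions for illustration) and compares the estimate with the finite-class bound 2√(log N / n):

```python
import numpy as np

rng = np.random.default_rng(0)

n, N = 200, 50  # sample size and (hypothetical) class size
# Each row holds one function's values g(x_1), ..., g(x_n), here in {-1, +1}.
G = rng.choice([-1.0, 1.0], size=(N, n))

def empirical_rademacher(G, n_trials=2000, rng=rng):
    """Monte Carlo estimate of E_sigma[ sup_g (1/n) sum_i sigma_i g(x_i) ]."""
    n = G.shape[1]
    total = 0.0
    for _ in range(n_trials):
        sigma = rng.choice([-1.0, 1.0], size=n)  # i.i.d. Rademacher signs
        total += np.max(G @ sigma) / n           # sup over the finite class
    return total / n_trials

est = empirical_rademacher(G)
bound = 2 * np.sqrt(np.log(N) / n)
print(f"estimate = {est:.3f}, bound 2*sqrt(log N / n) = {bound:.3f}")
```

With these values the Monte Carlo estimate comes out well below the bound, which is expected: the 2√(log N / n) bound holds for any finite class of bounded functions and is typically loose for a particular one.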