Generalization Analysis for Ranking - Learning to Rank for Information Retrieval

Information Technology Reference

In-Depth Information

−

Theorem 17.3 With probability at least 1

δ (0 <δ< 1), the following inequality

holds :

V( F )

ln δ

m −

R 0 (f ) − R 0 (f ) ≤

4 c

1 ,

(17.5)

F ={

, and V( F

where

sgn (f (x u )

−

f(x v ))

;

∈ F }

) is its VC dimension ; c is a

universal constant .

As compared to the results in [ 10 ], the bound as shown in Theorem 17.3 seems

tighter, although it is also based on the VC dimension. This is mainly due to the

consideration of the properties of U-statistics in the analysis.

In addition to the discussions on the pairwise 0-1 loss, in [ 9 ], the generaliza-

tion bounds with respect to surrogate loss functions are also discussed. Suppose the

surrogate loss function is φ , then the expected and empirical surrogate risks can be

defined as below.

R φ (f ) = E φ −

sgn (y u − y v ) · f(x u ) − f(x v ) ,

(17.6)

φ −

· f(x u )

f(x v ) .

m(m

R φ (f )

sgn (y u −

y v )

−

(17.7)

−

1 )

u =

v = u +

The corresponding generalization bound is shown in the following theorem.

Theorem 17.4 With probability at least 1

− δ( 0 <δ< 1 ) , the following inequality

holds :

B 2 ln δ

2 m

R φ (f )

− R φ (f ) ≤

√ m +

4 cBφ (B)

(17.8)

F ={

, and V( F

−

;

∈ F }

Here

) is its VC dimension ; c is a

universal constant ; the output of function f is uniformly bounded by

sgn (f (x u )

f(x v ))

−

B and B .

17.2.2 For Subset Ranking

In [ 13 ], the uniform bounds for the listwise surrogate loss functions are obtained in

the setting of subset ranking. Since queries are considered i.i.d. random variables

in subset ranking, the corresponding generalization bound is also referred to as the

query-level generalization bound. Under the subset ranking framework, the expected

and empirical risks are defined as in (16.13) and (16.14) (see Chap. 16).

Technically, the theory of the Rademacher average (RA) [ 4 , 5 ]isusedinthe

analysis. RA measures how much the function class can fit random noise, which is

defined below.

Learning to Rank for Information Retrieval

Search WWH ::

Custom Search

Home