Generalization Analysis for Ranking - Learning to Rank for Information Retrieval


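Definition 17.1 For a function class $\mathcal{G}$, the empirical Rademacher average (RA) is defined as

$$\hat{R}(\mathcal{G}) = E_{\sigma}\left[\sup_{g \in \mathcal{G}} \frac{1}{n}\sum_{i=1}^{n} \sigma_i g(z_i)\right], \qquad (17.9)$$

where $z_i$, $i = 1,\dots,n$, are i.i.d. random variables, and $\sigma_i$, $i = 1,\dots,n$, are i.i.d. random variables that take the values $1$ and $-1$ with probability $1/2$ each.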
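To make the definition concrete, the following is a minimal sketch that estimates the empirical RA by Monte Carlo sampling of the signs $\sigma_i$, assuming the norm-bounded linear class $\{x \mapsto w^T x : \|w\| \le B\}$ with the Euclidean norm. For that class the supremum over $w$ has the closed form $(B/n)\,\|\sum_i \sigma_i x_i\|$, so only the expectation over $\sigma$ is sampled. The function name `empirical_ra_linear`, the synthetic data, and the constants are illustrative assumptions, not part of the text.

```python
import math
import random


def empirical_ra_linear(xs, B, num_draws=1000, seed=0):
    """Monte Carlo estimate of E_sigma[ sup_{||w||<=B} (1/n) sum_i sigma_i w^T x_i ].

    For the norm-bounded linear class the supremum equals
    (B / n) * || sum_i sigma_i x_i ||_2 (Cauchy-Schwarz), so only the
    expectation over the random signs sigma_i is approximated by sampling.
    """
    rng = random.Random(seed)
    n, d = len(xs), len(xs[0])
    total = 0.0
    for _ in range(num_draws):
        s = [0.0] * d  # running value of sum_i sigma_i x_i
        for x in xs:
            sigma = rng.choice((-1.0, 1.0))  # +1 or -1, each with probability 1/2
            for j in range(d):
                s[j] += sigma * x[j]
        total += B / n * math.sqrt(sum(v * v for v in s))
    return total / num_draws


# Hypothetical data: n points in d dimensions, rescaled so that ||x|| = M.
data_rng = random.Random(1)
B, M, n, d = 2.0, 3.0, 200, 5
xs = []
for _ in range(n):
    v = [data_rng.gauss(0.0, 1.0) for _ in range(d)]
    norm = math.sqrt(sum(t * t for t in v))
    xs.append([M * t / norm for t in v])

estimate = empirical_ra_linear(xs, B)
reference = 2.0 * B * M / math.sqrt(n)  # the 2BM / sqrt(n) bound quoted for linear scoring functions
print(f"estimated empirical RA: {estimate:.4f}, reference bound: {reference:.4f}")
```

The estimate depends on the sampled data and the number of sign draws, so it should be read as an illustration of the definition rather than as a tight numerical value.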

Based on the RA theory [4, 5], the following generalization bounds have been derived. Here it is assumed that $\forall x \in \mathcal{X}$, $\|x\| \le M$, and, for simplicity, that the scoring function $f$ is learned from the linear function class $\mathcal{F} = \{x \to w^T x : \|w\| \le B\}$. In this case, one has $\forall x \in \mathcal{X}$, $|f(x)| \le BM$.
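The bound on $|f(x)|$ is a one-line consequence of the Cauchy-Schwarz inequality, assuming $\|\cdot\|$ denotes the Euclidean norm (the norm is not spelled out in this passage):

```latex
|f(x)| = |w^{T} x| \le \|w\| \, \|x\| \le B M .
```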

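Theorem 17.5 Let $A$ denote ListNet or ListMLE, and let $L_A(f; \mathbf{x}, \pi_y)$ be the corresponding listwise loss. Given the training data $S = \{(\mathbf{x}^{(i)}, \pi_y^{(i)}),\ i = 1,\dots,n\}$, if $\forall f \in \mathcal{F}$, $(\mathbf{x}, \pi_y) \in \mathcal{X} \times \mathcal{Y}$, $L_A(f; \mathbf{x}, \pi_y) \in [0, 1]$, then with probability at least $1 - \delta$ $(0 < \delta < 1)$, the following inequality holds:

$$\sup_{f \in \mathcal{F}} \left( R_{\varphi}(f) - \hat{R}_{\varphi}(f) \right) \le 2\, C_A(\phi)\, N(\phi)\, R(\mathcal{F}) + \sqrt{\frac{2 \log \frac{2}{\delta}}{n}}, \qquad (17.10)$$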

where $R(\mathcal{F})$ is the RA of the scoring function class (for the linear scoring function, we have $R(\mathcal{F}) \le \frac{2BM}{\sqrt{n}}$); $N(\phi) = \sup_{x \in [-BM, BM]} \phi'(x)$ measures the smoothness of the transformation function $\phi$; and $C_A(\phi)$ is an algorithm-dependent factor.
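As a rough numerical illustration of how the right-hand side of (17.10) behaves, the sketch below evaluates it for the linear scoring class, using $R(\mathcal{F}) \le 2BM/\sqrt{n}$. It assumes the confidence term has the form $\sqrt{2\log(2/\delta)/n}$; the function names and all numeric values (including the placeholder values for $C_A(\phi)$ and $N(\phi)$) are hypothetical and are not taken from Table 17.1.

```python
import math


def linear_ra_bound(B, M, n):
    """Upper bound 2BM / sqrt(n) on the RA of the norm-bounded linear class."""
    return 2.0 * B * M / math.sqrt(n)


def generalization_gap_bound(C_A, N_phi, B, M, n, delta):
    """Right-hand side of the bound: complexity term plus confidence term.

    The confidence term sqrt(2 * log(2 / delta) / n) is an assumed form.
    """
    if not 0.0 < delta < 1.0:
        raise ValueError("delta must lie strictly between 0 and 1")
    complexity = 2.0 * C_A * N_phi * linear_ra_bound(B, M, n)
    confidence = math.sqrt(2.0 * math.log(2.0 / delta) / n)
    return complexity + confidence


# Placeholder constants chosen only for illustration.
params = dict(C_A=1.0, N_phi=1.0, B=1.0, M=1.0, delta=0.05)
bound_n = generalization_gap_bound(n=1000, **params)
bound_4n = generalization_gap_bound(n=4000, **params)
print(bound_n, bound_4n)
```

Because both terms scale as $1/\sqrt{n}$, quadrupling the number of training queries halves the value of the bound in this sketch.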

The expressions of $N(\phi)$ and $C_A(\phi)$ for ListNet and ListMLE with respect to three representative transformation functions are listed in Table 17.1.¹

From Theorem 17.5, one can see that as the number of training queries $n$ approaches infinity, the query-level generalization bound converges to zero at a rate of $O(1/\sqrt{n})$. Furthermore, comparing the query-level generalization bounds of different listwise ranking algorithms under different transformation functions leads to the following observations.

• The query-level generalization bound for ListMLE is much tighter than that for ListNet, especially when $m$, the length of the list, is large.

• The query-level generalization bound for ListMLE decreases monotonically, while that of ListNet increases monotonically, with respect to $m$.

• The linear transformation function is the best choice in terms of the query-level generalization bound in most cases.

¹ The three transformation functions are:

Linear functions: $\phi_L(x) = ax + b$, $x \in [-BM, BM]$

Exponential functions: $\phi_E(x) = e^{ax}$, $x \in [-BM, BM]$

Sigmoid functions: $\phi_S(x) = \frac{1}{1 + e^{-ax}}$, $x \in [-BM, BM]$
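For the three transformation functions above, the smoothness factor $N(\phi)$, read as the supremum of $\phi'(x)$ over $[-BM, BM]$, can be approximated on a grid. The sketch below does this directly from that definition, assuming $a > 0$; the helper `sup_derivative` and the values of `a` and `BM` are hypothetical, and the results are direct evaluations of the supremum rather than entries copied from Table 17.1.

```python
import math


def sup_derivative(phi_prime, BM, grid=10001):
    """Grid approximation of the supremum of phi'(x) over [-BM, BM]."""
    points = (-BM + 2.0 * BM * k / (grid - 1) for k in range(grid))
    return max(phi_prime(x) for x in points)


a, BM = 1.0, 2.0  # hypothetical slope parameter and range bound


def d_linear(x):
    # derivative of a*x + b (the offset b does not affect the derivative)
    return a


def d_exponential(x):
    # derivative of exp(a*x)
    return a * math.exp(a * x)


def d_sigmoid(x):
    # derivative of 1 / (1 + exp(-a*x)), written as a * s * (1 - s)
    s = 1.0 / (1.0 + math.exp(-a * x))
    return a * s * (1.0 - s)


N = {
    "linear": sup_derivative(d_linear, BM),
    "exponential": sup_derivative(d_exponential, BM),
    "sigmoid": sup_derivative(d_sigmoid, BM),
}
for name, value in N.items():
    print(f"N(phi_{name}) ~= {value:.4f}")
```

With these settings the exponential function yields the largest factor, since its derivative grows as $a e^{ax}$ toward the right end of the interval, while the sigmoid's derivative peaks at $a/4$ at $x = 0$.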
