Generalization Analysis for Ranking - Learning to Rank for Information Retrieval

Information Technology Reference

In-Depth Information

generalization over queries; and with the two-layer ranking framework, one is con-

cerned with the generalization over both queries and documents.

In this chapter, we will introduce existing works on generalization analysis for

ranking, with respect to different types of bounds and different ranking frameworks.

17.2 Uniform Generalization Bounds for Ranking

Several works have been done on the uniform generalization bounds of either the

true loss or the surrogate loss functions in ranking. Some of them fall into the doc-

ument ranking framework, while others fall into the subset ranking framework and

the two-layer ranking framework.

17.2.1 For Document Ranking

In [ 10 ], the uniform generalization ability for RankBoost is discussed in the context

of bipartite document ranking. That is, there are two levels of relevance degrees in

the data and there is no notion of query considered in the analysis. The general-

ization bound is defined with respect to the pairwise 0-1 loss. That is, if a pair of

documents (whose labels are positive and negative respectively) is correctly ranked,

the loss is zero, otherwise it is one. The corresponding empirical and expected risks

are defined as in (16.5) and (16.6) (see Chap. 16).

To perform the analysis, the VC dimension [ 16 , 17 ] is used. According to the

theoretical results obtained in [ 10 ] (see the foll owing the orem), the generalization

bound converges to zero at a rate of O( max

log (m + )

m +

, log (m − )

m −

) , where m +

{

}

and

m −

are the numbers of relevant and irrelevant documents respectively.

F ,

which has a finite VC dimension V , the scoring functions f ( which are the weighted

combinations of the weak rankers ) belong to function class

Theorem 17.1 Assume that all the weak learners belong to function class

. Let S +

and S −

be positive and negative samples of size m +

and m − , respectively . That is , S + =

m +

m −

{ x j }

x j }

and S + ={

1 . Then with probability at least 1

−

δ (0 <δ< 1), the

following inequality holds for

∀ f ∈ F

V ( log 2 m +

log 1 δ

R 0 (f )

R 0 (f ) ≤

1 )

−

m +

V ( log 2 m −

log 1 δ

V +

1 )

(17.1)

m −

where V =

2 (V

1 )(T

1 ) log 2 (e(T

1 )) , T is the number of weak rankers in

RankBoost .

Learning to Rank for Information Retrieval

Search WWH ::

Custom Search

Home