where P denotes the induced distribution, defined by drawing an importance-weighted pair from S, which in turn is randomly drawn from an underlying distribution P.
In other words, if we minimize the regret of the classifier during training, we correspondingly minimize the regret of the ranking model, and thereby effectively maximize NDCG.
Note that ω(y_u, y_v, c), the weight in the loss function (3.32), is a function of the classifier c. However, this classifier is exactly what we learn by minimizing the loss function, so there is a chicken-and-egg problem. To solve it, an online learning method is proposed. One first sets an initial value c_0 for the classifier and derives an update rule by minimizing the loss, yielding an updated classifier c_1. The loss function is then updated by substituting c_1 into the weight, and the classifier is further updated to c_2 by minimizing this new loss. This alternation is repeated until convergence.
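The alternation above can be sketched as follows. This is a minimal illustration, not the exact algorithm of the original work: the weight function and the weighted pairwise hinge loss are hypothetical placeholders standing in for ω(y_u, y_v, c) and the loss (3.32).

```python
import numpy as np

def pair_weight(c, x_u, x_v):
    """Hypothetical stand-in for omega(y_u, y_v, c): here the weight
    simply grows with the score gap the current classifier assigns."""
    return 1.0 + abs(c @ (x_u - x_v))

def iterate_classifier(pairs, dim, steps=10, lr=0.1):
    """Chicken-and-egg scheme: alternately (a) freeze the weights using
    the current classifier c_t, then (b) update the classifier by one
    minimization step on the weighted pairwise hinge loss."""
    c = np.zeros(dim)  # c_0: initial classifier
    for _ in range(steps):
        # (a) compute weights with the current classifier held fixed
        weights = [pair_weight(c, x_u, x_v) for x_u, x_v, _ in pairs]
        # (b) one gradient step on the weighted pairwise hinge loss
        grad = np.zeros(dim)
        for w, (x_u, x_v, y) in zip(weights, pairs):
            if y * (c @ (x_u - x_v)) < 1.0:  # pair violates the margin
                grad -= w * y * (x_u - x_v)
        c = c - lr * grad / len(pairs)
    return c
```

In practice one would iterate until the change in c (or in the loss) falls below a tolerance rather than for a fixed number of steps.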
When applying the above idea to real ranking problems, an L1 regularization term is added to the loss function in order to encourage a sparse solution for the parameters of the ranking model. According to the experimental results reported in [31], this robust sparse ranker can outperform many other ranking methods, including LambdaRank [7].
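One standard way to handle such an L1 term is proximal gradient descent: take a gradient step on the smooth pairwise loss, then apply the soft-thresholding operator, which zeroes out small coordinates and so yields sparsity. The sketch below is illustrative only; the concrete loss and optimizer in [31] may differ.

```python
import numpy as np

def soft_threshold(w, t):
    """Proximal operator of the L1 norm: shrinks every coordinate
    toward zero by t and sets small coordinates exactly to zero."""
    return np.sign(w) * np.maximum(np.abs(w) - t, 0.0)

def l1_pairwise_step(c, pairs, lr=0.1, lam=0.05):
    """One proximal-gradient step on a pairwise hinge loss plus an
    L1 penalty lam * ||c||_1 (hypothetical placeholder loss)."""
    grad = np.zeros_like(c)
    for x_u, x_v, y in pairs:
        if y * (c @ (x_u - x_v)) < 1.0:  # pair violates the margin
            grad -= y * (x_u - x_v)
    c = c - lr * grad / len(pairs)       # gradient step on the smooth part
    return soft_threshold(c, lr * lam)   # proximal step for the L1 term
```

The larger lam is, the more aggressively coordinates are driven to exactly zero, trading some fit for a sparser ranking model.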
3.4 Summary
In this chapter, we have introduced several pairwise ranking methods, and shown how to improve their performance by balancing the distribution of document pairs across queries, emphasizing likely top-ranked documents, etc. Empirical studies have shown that such modifications indeed lead to performance gains.
In the next chapter, we will introduce the listwise approach, which also tries to
solve the four problems with the pairwise approach, probably in a more natural and
fundamental manner.
3.5 Exercises
3.1 What are the advantages of the pairwise approach as compared to the pointwise
approach?
3.2 Study the relationship among the loss functions used in different pairwise rank-
ing algorithms, and their relationships to the pairwise 0-1 loss.
3.3 Study the mathematical properties of the loss functions used in different pairwise ranking algorithms, such as convexity.
3.4 Study the inter-dependency among document pairs, and discuss the potential
issue of using the pairwise classification methods to perform learning to rank
on such data.
3.5 Provide an intuitive explanation of the margin in Ranking SVM.