the listwise approach takes the entire set of documents associated with a query as the input instance and their ranked list (or their relevance degrees) as the output. In this way, it has the potential to distinguish documents from different queries, and to consider the positional information in the output ranked list (although not all listwise ranking methods have fully utilized this information) in its learning process.

According to some previous studies [31], the performance of listwise ranking algorithms is in general better than that of pointwise or pairwise ranking algorithms. This is also verified by the discussions in Chap. 11, which covers the empirical ranking performances of different learning-to-rank algorithms, with the LETOR benchmark datasets as the experimental platform.

On the other hand, the listwise approach also has certain aspects to improve. For example, the training complexities of some listwise ranking algorithms (e.g., ListNet and BoltzRank) are high since the evaluation of their loss functions is permutation based. More efficient learning algorithms are needed to make the listwise approach more practical. Moreover, the positional information has not been fully utilized in some listwise ranking algorithms, although it is visible to their loss functions. For example, there is no explicit position discount considered in the loss functions of ListNet and ListMLE. By introducing certain position discount factors, one can expect performance improvements for these algorithms.
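To make the last point concrete, recall that the ListMLE loss is the negative log Plackett-Luce likelihood of the ground-truth permutation; a position discount can be folded in by weighting the per-position terms of that likelihood. The sketch below is only illustrative: the function name is invented here, and the DCG-style 1/log2(position + 2) weights are one possible discount choice, not a scheme prescribed by this chapter.

```python
import math

def listmle_loss(scores, position_discounts=None):
    """Negative log Plackett-Luce likelihood of the given order.

    `scores` are model scores listed in the ground-truth ranked order
    (best document first). Positions are 0-indexed. With no discounts
    this is the plain ListMLE loss; passing weights yields one possible
    position-discounted variant.
    """
    n = len(scores)
    if position_discounts is None:
        position_discounts = [1.0] * n  # plain ListMLE: every position counts equally
    loss = 0.0
    for i in range(n):
        # log-normalizer over the documents not yet placed in the ranking
        log_z = math.log(sum(math.exp(s) for s in scores[i:]))
        loss += position_discounts[i] * (log_z - scores[i])
    return loss

# Illustrative DCG-style discount: top positions weigh more.
discounts = [1.0 / math.log2(i + 2) for i in range(3)]
weighted = listmle_loss([2.0, 1.0, 0.0], discounts)
```

Note that scores agreeing with the ground-truth order yield a smaller loss than reversed scores, which is the property the maximum-likelihood training objective exploits.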

4.5 Exercises

4.1 Derive the gradient of SoftNDCG with respect to the parameter w of the linear ranking model.

4.2 SoftRank and Approximate Rank actually have strong connections with each other: SoftRank also leads to a kind of sigmoid approximation of the rank position. Prove this connection and show which sigmoid function SoftRank uses implicitly.

4.3 Explain why the strategy used in SVM^map can find the most violated constraint, and analyze its time complexity.

4.4 Prove that under certain conditions the loss functions in ListNet and ListMLE are convex.

4.5 Suppose there are two features for learning to rank. Use a synthetic example to show the surface of NDCG in this case. Further plot the surfaces of SoftNDCG and AppNDCG, and study how well they approximate NDCG.

4.6 List the permutation probability models that can be used for ranking, other than the Plackett-Luce model, which has been introduced in this chapter.

4.7 Show the possible ways of introducing position-based weights into ListMLE and ListNet, and compare their performances using the LETOR benchmark datasets.

4.8 Suppose the true loss for ranking is the permutation-level 0-1 loss. Prove that under certain assumptions on the permutation probability space, the optimal ranker given by ListMLE also minimizes the permutation-level 0-1 loss.