Information Technology Reference
the listwise approach takes the entire set of documents associated with a query as
the input instance and their ranked list (or their relevance degrees) as the output. In
this way, it has the potential to distinguish documents from different queries, and to
consider the positional information in the output ranked list (although not all list-
wise ranking methods have fully utilized this information) in its learning process.
According to some previous studies [ 31 ], the performances of the listwise ranking
algorithms are in general better than the pointwise or pairwise ranking algorithms.
This is also verified by the discussions in Chap. 11, which is about the empiri-
cal ranking performances of different learning-to-rank algorithms, with the LETOR
benchmark datasets as the experimental platform.
On the other hand, the listwise approach also has certain aspects to improve.
For example, the training complexities of some listwise ranking algorithms (e.g.,
ListNet and BoltzRank) are high since the evaluation of their loss functions are
permutation based. A more efficient learning algorithm is needed to make the list-
wise approach more practical. Moreover, the positional information has not been
fully utilized in some listwise ranking algorithms, although it is visible to their loss
functions. For example, there is no explicit position discount considered in the loss
functions of ListNet and ListMLE. By introducing certain position discount factors,
performance improvement of these algorithms can be expected.
4.1 Derive the gradient of SoftNDCG with respect to the parameter w of the linear
4.2 Actually SoftRank and Approximate Rank have strong connections with each
other. In other words, SoftRank also leads to a kind of sigmoid approximation
of the rank position. Prove this connection and show what kind of sigmoid
function SoftRank uses implicitly.
4.3 Explain why the strategy used in SVM map can find the most violated constraint,
and analyze its time complexity.
4.4 Prove that under certain conditions the loss functions in ListNet and ListMLE
4.5 Suppose there are two features for learning to rank, please use a synthetic ex-
ample to show the surface of NDCG in this case. Please further plot the surface
of SoftNDCG and AppNDCG, and study their approximation to NDCG.
4.6 Please list the permutation probability models that can be used for ranking,
except the Plackett-Luce model which has been well introduced in this chapter.
4.7 Show the possible ways of introducing position-based weights to ListMLE
and ListNet, and compare their performances using the LETOR benchmark
4.8 Suppose the true loss for ranking is the permutation-level 0-1 loss. Prove that
under certain assumptions on the permutation probability space, the optimal
ranker given by ListMLE can also minimize the permutation-level 0-1 loss.