Information Technology Reference
like regression and classification. Since the output space of ranking is much more
complex than those of regression and classification, and there is a hierarchical
structure in the ranking problem (i.e., query-document), the generalization ability
and statistical consistency in ranking should be revisited.
As a summary, we plot the representative algorithms of the three approaches
introduced in this topic in Fig. 19.1 . Due to space restrictions, we have not put al-
gorithms on data preprocessing, advanced ranking tasks, and theoretical work on
ranking in this figure. From the figure, one can find that learning to rank for infor-
mation retrieval has become hotter and hotter in recent years, and more and more
attention has been paid to the listwise approach.
At the end of this topic, let us come back to the questions we raised in the intro-
duction, and provide their possible answers.
(1) In what respect are these learning-to-rank algorithms similar and in which
aspects do they differ? What are the strengths and weaknesses of each algorithm?
The answer can be found by the algorithm description and categorization. Ba-
sically algorithms belonging to the pointwise approach reduce ranking to either
regression, classification, or ordinal regression. Algorithms belonging to the pair-
wise approach reduce ranking to pairwise classification. The advantage of these two
approaches is that many existing theories and tools in machine learning can be di-
rectly applied. However, distinct properties of ranking have not been considered in
their formulations. For example, most evaluation measures for ranking are query
level and position based. However, neither the query information nor the positional
information is visible to the loss functions of these two approaches. The listwise
approach instead treats ranking as a new problem, and defines specific algorithms
for it. It can better leverage the concept of query , and consider the positional infor-
mation in the ranking results when training the ranking model. The problem with
the listwise approach is that it is in general more complex than the pointwise and
pairwise approaches, and a new theoretical foundation is needed.
According to the analysis in Chap. 5, the loss functions of many learning-to-rank
methods, no matter whether pointwise, pairwise, or listwise, are upper bounds of
MAP). Therefore, the minimization of these loss functions
can lead to the minimization of (1
NDCG) and (1
NDCG) and (1
MAP), or the maximization
of NDCG and MAP.
(2) Empirically speaking, which of those many learning-to-rank algorithms per-
form the best?
According to the discussions in Part IV, the LETOR benchmark datasets have
been widely used in the recent literature of learning to rank. Due to the standard data
collection, feature representation, dataset partitioning, and evaluation tools provided
in LETOR, it is possible to perform fair comparisons among different learning-to-
rank methods. Empirical studies on LETOR have shown that the listwise ranking
algorithms seem to have certain advantages over other algorithms, especially for the
top positions of the ranking results, and the pairwise ranking algorithms seem to