Chapter 4
The Listwise Approach
Abstract In this chapter, we introduce the listwise approach to learning to rank.
Specifically, we first introduce those listwise ranking algorithms that minimize
measure-specific loss functions (also referred to as direct optimization methods),
and then introduce some other algorithms whose loss functions are not directly re-
lated to evaluation measures for information retrieval.
4.1 Overview
The listwise approach takes the entire set of documents associated with a query
in the training data as the input and predicts their ground truth labels. According
to the loss functions used, the approach can be divided into two sub-categories.
For the first sub-category, the loss function is explicitly related to evaluation mea-
sures (e.g., approximation or upper bound of the measure-based ranking errors).
Example algorithms include [5, 7, 25, 27, 32–34, 36, 37]. Due to the strong rela-
tionship between the loss functions and the evaluation measures, these algorithms
are also referred to as direct optimization methods. For the second sub-category,
the loss function is not explicitly related to the evaluation measures. Example al-
gorithms include [4, 16, 30, 31]. In the rest of this chapter, we will introduce both
sub-categories and their representative algorithms.
Note that the listwise approach assumes that the ground truth labels are given in
terms of permutations, while the judgments might be in other forms (e.g., relevance
degrees or pairwise preferences). In this case, we will use the concept of equivalent
permutation set (denoted as Ω_y, see Chap. 1) to bridge the gap. With Ω_y, we use
L(f; x, Ω_y) to represent the loss function for the listwise approach. However, for
ease of discussion, we sometimes still write the loss function as L(f; x, y) if Ω_y is
generated from the relevance degree of each single document.
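To make the notion of the equivalent permutation set concrete, the following sketch (not from the book; the function name and the use of 0-indexed positions are our own choices) enumerates all permutations of a small document set that place the documents in non-increasing order of relevance degree; each such permutation is a valid ground-truth permutation π_y ∈ Ω_y.

from itertools import permutations

def equivalent_permutation_set(relevance):
    # relevance[u] is the graded judgment of document u (0-indexed).
    # A permutation pi_y maps each document to its rank position; it belongs to
    # Omega_y iff it places the documents in non-increasing order of relevance.
    # Exhaustive enumeration is only feasible for small document sets.
    m = len(relevance)
    omega_y = []
    for order in permutations(range(m)):  # order[j] = document placed at position j
        if all(relevance[order[j]] >= relevance[order[j + 1]] for j in range(m - 1)):
            pi_y = [0] * m
            for position, doc in enumerate(order):
                pi_y[doc] = position  # pi_y(doc) = rank position of doc
            omega_y.append(tuple(pi_y))
    return omega_y

# Three documents with relevance degrees 2, 0, 2: documents 0 and 2 are tied,
# so Omega_y contains two equivalent ground-truth permutations.
print(equivalent_permutation_set([2, 0, 2]))  # [(0, 2, 1), (1, 2, 0)]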
Furthermore, evaluation measures such as MAP and NDCG can also be rewritten
in the form of Ω_y. For example, we can rewrite NDCG as follows:

\mathrm{NDCG}(\pi, \Omega_y) = \frac{1}{Z_m} \sum_{j=1}^{m} G\bigl(l_{\pi_y^{-1}(j)}\bigr)\, \eta\bigl(\pi\bigl(\pi_y^{-1}(j)\bigr)\bigr), \quad \pi_y \in \Omega_y,   (4.1)

where G(·) is the gain function, η(·) is the position discount, and Z_m is the normalization factor, as defined in Chap. 1.
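To make the notation in Eq. (4.1) concrete, here is a minimal sketch of the computation, assuming 0-indexed positions and the common choices G(l) = 2^l − 1 and η(j) = 1/log2(j + 2) for the gain and discount (the equation itself leaves G and η generic); the function name and the example data are ours.

import math

def ndcg(pi, pi_y, relevance):
    # pi[u]        : position of document u under the predicted permutation pi.
    # pi_y[u]      : position of document u under a ground-truth permutation in Omega_y.
    # relevance[u] : graded judgment l_u of document u.
    m = len(relevance)
    gain = lambda l: 2.0 ** l - 1.0              # G(l), a common choice
    discount = lambda j: 1.0 / math.log2(j + 2)  # eta(j) for a 0-indexed position j

    # pi_y^{-1}(j): the document that pi_y places at position j.
    inv_pi_y = [0] * m
    for doc, pos in enumerate(pi_y):
        inv_pi_y[pos] = doc

    # Z_m: DCG of the ideal ranking, i.e., of pi_y itself.
    z_m = sum(gain(relevance[inv_pi_y[j]]) * discount(j) for j in range(m))

    # Eq. (4.1): gain of document pi_y^{-1}(j), discounted at its predicted position.
    dcg = sum(gain(relevance[inv_pi_y[j]]) * discount(pi[inv_pi_y[j]])
              for j in range(m))
    return dcg / z_m

# A predicted ranking that swaps the top two documents of the ideal ranking.
print(ndcg(pi=[1, 0, 2], pi_y=[0, 1, 2], relevance=[2, 1, 0]))  # about 0.80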