Information Technology Reference
In-Depth Information
If the judgment is given as the total order π l , one can straightforwardly define
π y =
π l .
Note that for the listwise approach, the output space that facilitates the learning
process is exactly the same as the output space of the task. In this regard, the theoret-
ical analysis on the listwise approach can have a more direct value to understanding
the real ranking problem than the other approaches where there are mismatches be-
tween the output space that facilitates learning and the real output space of the task.
The hypothesis space contains multi-variate functions h that operate on a set of
documents and predict their permutation. For practical reasons, the hypothesis h
is usually implemented with a scoring function f , e.g., h( x )
f( x ) . That
is, first a scoring function f is used to give a score to each document, and then
these documents are sorted in descending order of the scores to produce the desired
permutation.
There are two types of loss functions , widely used in the listwise approach. For
the first type, the loss function is explicitly related to the evaluation measures (which
we call the measure-specific loss function), while for the second type, the loss func-
tion is not (which we call the non-measure-specific loss function). Note that some-
times it is not very easy to determine whether a loss function is listwise, since some
building blocks of a listwise loss may also seem to be pointwise or pairwise. In
this topic, we mainly distinguish a listwise loss from a pointwise or pairwise loss
according to the following criteria:
=
sort
A listwise loss function is defined with respect to all the training documents as-
sociated with a query.
A listwise loss function cannot be fully decomposed to simple summation over
individual documents or document pairs.
A listwise loss function emphasizes the concept of a ranked list, and the positions
of the documents in the final ranking result are visible to the loss function.
Just because of these properties of the loss function, the listwise approach is in
more accordance with the ranking task in information retrieval than the pointwise
and pairwise approaches.
Example algorithms that belong to the listwise approach include [ 10 - 12 , 38 , 59 ,
72 , 80 , 82 , 84 , 85 , 88 , 89 , 96 ]. We will introduce some of them in Chap. 4.
To sum up, we list the key components for each approach to learning to rank in
Table 1.2 . In this table, for simplicity, we assume the use of a scoring function f in
the hypothesis of all the approaches, although it is not always necessary to be the
case.
It should be noted that although the scoring function is a kind of “pointwise”
function, it is not to say that all the approaches are in nature pointwise approaches.
The categorization of the aforementioned three approaches is mainly based on the
four pillars of machine learning. That is, different approaches regard the same train-
ing data as in different input and output spaces, and use different loss functions and
hypotheses. Accordingly, they will have difference theoretical properties. We will
make more discussions on this in Part VI, with the introduction of a new theory
called the statistical learning theory for ranking .
Search WWH ::




Custom Search