Introduction - Learning to Rank for Information Retrieval

Information Technology Reference

In-Depth Information

•

If the judgment is given as the total order π l , one can straightforwardly define

π y =

π l .

Note that for the listwise approach, the output space that facilitates the learning

process is exactly the same as the output space of the task. In this regard, the theoret-

ical analysis on the listwise approach can have a more direct value to understanding

the real ranking problem than the other approaches where there are mismatches be-

tween the output space that facilitates learning and the real output space of the task.

The hypothesis space contains multi-variate functions h that operate on a set of

documents and predict their permutation. For practical reasons, the hypothesis h

is usually implemented with a scoring function f , e.g., h( x )

f( x ) . That

is, first a scoring function f is used to give a score to each document, and then

these documents are sorted in descending order of the scores to produce the desired

permutation.

There are two types of loss functions , widely used in the listwise approach. For

the first type, the loss function is explicitly related to the evaluation measures (which

we call the measure-specific loss function), while for the second type, the loss func-

tion is not (which we call the non-measure-specific loss function). Note that some-

times it is not very easy to determine whether a loss function is listwise, since some

building blocks of a listwise loss may also seem to be pointwise or pairwise. In

this topic, we mainly distinguish a listwise loss from a pointwise or pairwise loss

according to the following criteria:

=

sort

◦

•

A listwise loss function is defined with respect to all the training documents as-

sociated with a query.

•

A listwise loss function cannot be fully decomposed to simple summation over

individual documents or document pairs.

•

A listwise loss function emphasizes the concept of a ranked list, and the positions

of the documents in the final ranking result are visible to the loss function.

Just because of these properties of the loss function, the listwise approach is in

more accordance with the ranking task in information retrieval than the pointwise

and pairwise approaches.

Example algorithms that belong to the listwise approach include [ 10 - 12 , 38 , 59 ,

72 , 80 , 82 , 84 , 85 , 88 , 89 , 96 ]. We will introduce some of them in Chap. 4.

To sum up, we list the key components for each approach to learning to rank in

Table 1.2 . In this table, for simplicity, we assume the use of a scoring function f in

the hypothesis of all the approaches, although it is not always necessary to be the

case.

It should be noted that although the scoring function is a kind of “pointwise”

function, it is not to say that all the approaches are in nature pointwise approaches.

The categorization of the aforementioned three approaches is mainly based on the

four pillars of machine learning. That is, different approaches regard the same train-

ing data as in different input and output spaces, and use different loss functions and

hypotheses. Accordingly, they will have difference theoretical properties. We will

make more discussions on this in Part VI, with the introduction of a new theory

called the statistical learning theory for ranking .

Learning to Rank for Information Retrieval

Search WWH ::

Custom Search

Home