Information Technology Reference
The Pairwise Approach
Abstract In this chapter we will introduce the pairwise approach to learning to
rank. Specifically we first introduce several example algorithms, whose major differ-
ences are in the loss functions. Then we discuss the limitations of these algorithms
and present some improvements that enable better ranking performance.
The pairwise approach does not focus on accurately predicting the relevance degree
of each document; instead, it cares about the relative order between two documents.
In this sense, it is closer to the concept of “ranking” than the pointwise approach.
In the pairwise approach, ranking is usually reduced to a classification on docu-
ment pairs, i.e., to determine which document in a pair is preferred. That is, the goal
of learning is to minimize the number of miss-classified document pairs. In the ex-
treme case, if all the document pairs are correctly classified, all the documents will
be correctly ranked. Note that this classification differs from the classification in the
pointwise approach, since it operates on every two documents under investigation.
A natural concern is that document pairs are not independent, which violates the
basic assumption of classification. The fact is that although in some cases this as-
sumption really does not hold, one can still use classification technology to learn the
ranking model. However, a different theoretical framework is needed to analyze the
generalization of the learning process. We will make discussions on this in Part VI.
There are many pairwise ranking algorithms proposed in the literature of learning
to rank, based on neural network [ 8 ], perceptron [ 21 , 30 ], Boosting [ 18 ], support
vector machines [ 22 , 23 ], and other learning machines [ 12 , 33 , 36 , 37 ]. In the rest
of this chapter, we will introduce several examples of them.
3.2 Example Algorithms
3.2.1 Ordering with Preference Function
In [ 12 ], a hypothesis h(x u ,x v ) directly defined on a pair of documents is studied
(i.e., without use of the scoring function f ). In particular, given two documents x u