the loss function, such as the Bradley-Terry model and the Thurstone-Mosteller
model [6, 32]. Again, a gradient boosting tree is used as the algorithm to minimize
the loss function and learn the ranking function.
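As a minimal sketch (not the book's exact formulation), the Bradley-Terry model converts a pair of model scores into a preference probability, P(i preferred over j) = exp(s_i)/(exp(s_i) + exp(s_j)), whose negative log-likelihood can serve as the pairwise loss that a gradient boosting tree minimizes:

```python
import math

def bt_pairwise_loss(s_i, s_j):
    """Bradley-Terry negative log-likelihood that document i is
    preferred over document j, given model scores s_i and s_j.
    Equals -log P(i > j) = log(1 + exp(s_j - s_i))."""
    return math.log(1.0 + math.exp(s_j - s_i))

# The loss shrinks as the score gap s_i - s_j grows in the
# correct direction, and grows when the order is inverted.
print(bt_pairwise_loss(2.0, 0.0))  # correctly ordered pair: small loss
print(bt_pairwise_loss(0.0, 2.0))  # inverted pair: large loss
```

In a boosting setting, the negative gradient of this loss with respect to each score would supply the pseudo-residuals that the next tree is fit to.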
3.3 Improved Algorithms
The pairwise approach appears to have advantages over the pointwise
approach, since it models the relative order between documents rather than the ab-
solute relevance degree of a document. However, the pairwise approach
also has its disadvantages.
1. When the relevance judgments are given in terms of multiple ordered cat-
egories, converting them to pairwise preferences loses the information about
the finer granularity in the judgments. Since any two documents with different
relevance degrees can form a document pair, we may have one document pair
whose original relevance degrees are Excellent and Bad, and another document
pair whose original relevance degrees are Fair and Bad. These two pairs clearly
represent preferences of different magnitudes; however, the algorithms introduced
above treat them equally, without distinction.
2. Since the number of pairs can be quadratic in the number of documents, the
imbalanced distribution of pairs across queries may be even more serious for the
pairwise approach than for the pointwise approach.
3. The pairwise approach is more sensitive to noisy labels than the pointwise ap-
proach. That is, a noisy relevance judgment on a single document can lead to a
large number of mis-labeled document pairs.
4. Most of the pairwise ranking algorithms introduced above do not consider the
position of documents in the final ranking results, but instead define their loss
functions directly on the basis of individual document pairs.
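To make the first two problems concrete, here is a small illustrative sketch (the document names and grades are invented) of how graded judgments are flattened into unweighted preference pairs:

```python
from itertools import combinations

# Hypothetical graded judgments for one query (ordinal labels).
grades = {"Excellent": 3, "Good": 2, "Fair": 1, "Bad": 0}
docs = [("d1", "Excellent"), ("d2", "Fair"), ("d3", "Bad"), ("d4", "Bad")]

# Any two documents with different grades form a preference pair,
# ordered so the more relevant document comes first.
pairs = [(a, b) if grades[la] > grades[lb] else (b, a)
         for (a, la), (b, lb) in combinations(docs, 2)
         if grades[la] != grades[lb]]

print(pairs)
```

The (Excellent, Bad) pair (d1, d3) and the (Fair, Bad) pair (d2, d3) receive the same weight even though the first expresses a much stronger preference, and the number of pairs grows quadratically with the number of documents per query.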
These problems may affect the effectiveness of the pairwise approach. Several
improved algorithms have been proposed to tackle them, as described below.
3.3.1 Multiple Hyperplane Ranker
To tackle the first problem of the pairwise approach mentioned above, Qin et
al. [26] propose a new algorithm named Multiple Hyperplane Ranker (MHR).
The basic idea is "divide-and-conquer". Suppose there are K different categories
of judgments; then one can train K(K − 1)/2 Ranking SVM models in total, with
each model trained on the document pairs from two categories of judgments (see
Fig. 3.8). At the test phase, rank aggregation is used to merge the ranking results
given by the individual models into the final ranking result. For instance, suppose that