The MP loss is defined as follows:
L_{MP}(f; x_u, x_v, y_{u,v}) = \left| \big(f(x_u) - f(x_v)\big) - y_{u,v} \right|^{\beta}.    (3.21)
The HMP loss is defined as follows:
L_{HMP}(f; x_u, x_v, y_{u,v}) = \begin{cases} 0, & y_{u,v}\big(f(x_u) - f(x_v)\big) \ge 0, \\ \left| \big(f(x_u) - f(x_v)\big) - y_{u,v} \right|^{\beta}, & \text{otherwise}. \end{cases}    (3.22)
The SVR loss is defined as follows:
L_{SVR}(f; x_u, x_v, y_{u,v}) = \begin{cases} 0, & \left| \big(f(x_u) - f(x_v)\big) - y_{u,v} \right| < \varepsilon, \\ \left( \left| \big(f(x_u) - f(x_v)\big) - y_{u,v} \right| - \varepsilon \right)^{\beta}, & \text{otherwise}. \end{cases}    (3.23)
The differences among the three loss functions lie in the conditions under which they penalize a pair, not in the degree of penalization. For the MP loss, not only mis-ranked pairs but also correctly ranked pairs are penalized if the magnitude of the predicted preference deviates from the labeled preference; for the HMP loss, only the mis-ranked pairs are penalized; for the SVR loss, a pair (whether correctly ranked or mis-ranked) is penalized only if the magnitude of the predicted preference differs from the labeled preference by at least ε.
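As a concrete illustration, the three losses can be sketched in Python; the values of β and ε here are hypothetical settings for illustration, not ones prescribed by the text:

```python
# Sketch of the three pairwise losses (Eqs. (3.21)-(3.23)); beta and eps
# are hypothetical settings.  f_u, f_v stand for the predicted scores
# f(x_u), f(x_v); y_uv is the labeled preference y_{u,v}.

def mp_loss(f_u, f_v, y_uv, beta=2.0):
    # Penalizes any deviation of the predicted preference from the
    # labeled preference, even for correctly ranked pairs.
    return abs((f_u - f_v) - y_uv) ** beta

def hmp_loss(f_u, f_v, y_uv, beta=2.0):
    # Zero for correctly ranked pairs (label and predicted preference
    # agree in sign); MP-style penalty only for mis-ranked pairs.
    if y_uv * (f_u - f_v) >= 0:
        return 0.0
    return abs((f_u - f_v) - y_uv) ** beta

def svr_loss(f_u, f_v, y_uv, beta=2.0, eps=0.1):
    # Zero inside an eps-insensitive tube around the labeled preference;
    # penalized outside the tube, whether the pair is correctly ranked or not.
    dev = abs((f_u - f_v) - y_uv)
    return 0.0 if dev < eps else (dev - eps) ** beta
```

For example, for a correctly ranked pair whose predicted preference overshoots the label (say f(x_u) − f(x_v) = 3 while y_{u,v} = 1), the MP and SVR losses are positive while the HMP loss is zero, matching the discussion above.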
Then an L2 regularization term is introduced to these loss functions, and the resulting objectives are optimized using kernel methods. Experimental results show that the magnitude-preserving loss functions can lead to better ranking performance than the original pairwise ranking algorithms, such as RankBoost [18].
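In generic form, the regularized objective over all training pairs can be written as (the symbols λ for the regularization coefficient and ‖f‖_K for the norm of f in the kernel-induced function space are our notation, not the text's):

\min_{f} \sum_{(u,v)} L(f; x_u, x_v, y_{u,v}) + \lambda \|f\|_{K}^{2},

where L is any of the three losses above; a larger λ trades training loss for a smoother ranking function.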
3.3.3 IR-SVM
Recall the second problem with the pairwise approach mentioned above: the variation in the number of document pairs across queries is usually significantly larger than the variation in the number of documents. This phenomenon has been observed in some previous studies [9, 27].
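To make the imbalance concrete, the following sketch (with made-up pair counts and per-pair losses) shows how a query with many pairs dominates the unnormalized pairwise loss:

```python
# Hypothetical example: query A has many pairs with small per-pair loss,
# while query B has few pairs but is ranked much worse on a per-pair basis.
per_pair_loss = {"A": 0.1, "B": 1.0}
num_pairs = {"A": 1000, "B": 10}

total = sum(per_pair_loss[q] * num_pairs[q] for q in per_pair_loss)
share_a = per_pair_loss["A"] * num_pairs["A"] / total

# Query A accounts for about 91% of the total pairwise loss, so an
# optimizer driven by this sum concentrates on A even though query B
# is the one ranked poorly.
print(f"share of total loss from query A: {share_a:.2f}")
```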
In this case, the pairwise loss function will be dominated by the queries with
a large number of document pairs, and as a result the pairwise loss function will
become inconsistent with the query-level evaluation measures. To tackle the prob-
lem, Cao et al. [ 9 ] propose introducing query-level normalization to the pairwise
loss function. That is, the pairwise loss for a query will be normalized by the total
number of document pairs associated with that query. In this way, the normalized
pairwise losses with regard to different queries will become comparable to each
other in their magnitude, no matter how many document pairs they are originally
associated with. With this kind of query-level normalization, Ranking SVM will
become a new algorithm, referred to as IR-SVM [ 9 ]. Specifically, given n training
queries \{q_i\}_{i=1}^{n}, their associated document pairs (x_u^{(i)}, x_v^{(i)}), and the corresponding