Given the training queries $q_i$ ($i = 1, \dots, n$), their associated document pairs $(x_u^{(i)}, x_v^{(i)})$, and the corresponding relevance judgment $y_{u,v}^{(i)}$, IR-SVM solves the following optimization problem:
$$\min \; \frac{1}{2}\|w\|^2 + \lambda \sum_{i=1}^{n} \frac{1}{\tilde{m}^{(i)}} \sum_{u,v:\, y_{u,v}^{(i)}=1} \xi_{u,v}^{(i)}$$

$$\text{s.t.}\quad w^T \bigl(x_u^{(i)} - x_v^{(i)}\bigr) \geq 1 - \xi_{u,v}^{(i)}, \quad \text{if } y_{u,v}^{(i)} = 1;$$

$$\xi_{u,v}^{(i)} \geq 0; \qquad i = 1, \dots, n, \qquad (3.24)$$

where $\tilde{m}^{(i)}$ is the number of document pairs associated with query $q_i$.
According to the experimental results in [9], introducing the query-level normalization leads to a significant performance improvement.
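To make the query-level normalization concrete, here is a minimal Python sketch (the function name, data layout, and parameters are our own illustrative assumptions, not code from [9]) that evaluates the IR-SVM objective of Eq. (3.24) with each slack variable replaced by its optimal value, $\xi_{u,v}^{(i)} = \max(0, 1 - w^T(x_u^{(i)} - x_v^{(i)}))$:

```python
import numpy as np

def ir_svm_objective(w, queries, lam=1.0):
    """Evaluate the IR-SVM objective of Eq. (3.24).

    queries: a list with one entry per query q_i; each entry is a list of
    feature-vector pairs (x_u, x_v) for which y_{u,v}^{(i)} = 1, i.e.,
    x_u should be ranked above x_v.
    """
    objective = 0.5 * np.dot(w, w)                # (1/2) ||w||^2
    for pairs in queries:
        m_tilde = len(pairs)                      # m~^(i): pairs for query q_i
        if m_tilde == 0:
            continue
        # At the optimum, the slack is xi = max(0, 1 - w^T (x_u - x_v)).
        hinge_sum = sum(max(0.0, 1.0 - float(np.dot(w, x_u - x_v)))
                        for x_u, x_v in pairs)
        objective += lam * hinge_sum / m_tilde    # query-level normalization
    return objective
```

Dividing each query's hinge-loss sum by $\tilde{m}^{(i)}$ prevents queries with many judged document pairs from dominating the objective, which is exactly the effect of the query-level normalization discussed above.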
3.3.4 Robust Pairwise Ranking with Sigmoid Functions
In [10], Carvalho et al. try to tackle the aforementioned third problem with the pairwise approach, i.e., its sensitivity to noisy relevance judgments.
Using some case studies, Carvalho et al. point out that the problem partly comes from the shape of the loss function. For example, Ranking SVM [22, 23] uses the hinge loss, which grows linearly as the score difference becomes more negative. As a result, outliers that produce large negative scores contribute heavily to the overall loss, and in turn play an important role in determining the final learned ranking function (see the comparison sketch after Eq. (3.25)).
To solve the problem, a new loss based on the sigmoid function (denoted as the sigmoid loss) is proposed to replace the hinge loss for learning to rank. Specifically, the sigmoid loss has the following form (see Fig. 3.9):
$$L(f; x_u, x_v, y_{u,v}) = \frac{1}{1 + e^{\sigma y_{u,v}(f(x_u) - f(x_v))}}. \qquad (3.25)$$

[Fig. 3.9: The sigmoid function based loss ($\sigma = 5$)]
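To see why the sigmoid loss is more robust to outliers than the hinge loss, the following Python sketch (illustrative only, with $z$ denoting $y_{u,v}(f(x_u) - f(x_v))$) evaluates both losses over a range of margins:

```python
import numpy as np

def hinge_loss(z):
    # Ranking SVM's hinge loss: grows linearly as z decreases, so a
    # badly misranked pair (large negative z) dominates the total loss.
    return np.maximum(0.0, 1.0 - z)

def sigmoid_loss(z, sigma=5.0):
    # The sigmoid loss of Eq. (3.25): bounded above by 1, so even an
    # extreme outlier contributes at most 1 to the overall loss.
    return 1.0 / (1.0 + np.exp(sigma * z))

for z in (-5.0, -1.0, 0.0, 1.0, 5.0):
    print(f"z = {z:+.1f}  hinge = {hinge_loss(z):6.2f}  "
          f"sigmoid = {sigmoid_loss(z):.4f}")
```

For $z = -5$, the hinge loss contributes 6.0 while the sigmoid loss saturates near 1; this bounded contribution is why outliers with large negative scores no longer dominate the learned ranking function.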
While the new loss function solves the aforementioned problem with the hinge loss, it also introduces a new problem. That is, the sigmoid loss is non-convex, and thus the op-