Statistical Consistency for Ranking - Learning to Rank for Information Retrieval

Information Technology Reference

In-Depth Information

its counterpart; (iii) there exists a permutation, for which the speed of change in loss

with respect to the score of a document will become faster if exchanging its position

with another document with the same score but ranked lower.

Based on the two new concepts, a theorem similar to Theorem 18.3 has been

obtained, which is now shown.

Theorem 18.4 Let L φ (f

documents , if its top-k subgroup probability space is order preserving with respect

to m

;

x ,π y ) be a top-k order sensitive loss function . Fo r

∀

i =

−

1 document pairs

{

(j i ,j i + 1 )

}

1 and

{

(j k + s i ,j k + i :

≤

s i <i)

}

, then the

i =

loss L φ (f

;

x ,π y ) is consistent with the top-k true loss .

In addition to the above result, the change of statistical consistency with respect

to different k values has also been discussed in [ 12 ], and the following conclusions

are obtained.

•

For the consistency with the top- k true loss, when k becomes smaller, the re-

quirement on the probability space becomes weaker but the requirement on the

surrogate loss function becomes stronger.

•

If we fix the true loss to be top- k and the probability space to be top- k order pre-

serving, the surrogate loss function should be at most top- l ( l ≤ k ) order sensitive

in order to meet the consistency conditions.

According to Theorem 18.4 , the surrogate loss functions optimized by ListMLE

and ListNet are not consistent with the top- k true loss. Modifications to the algo-

rithms are needed to make them consistent. For this purpose, one should replace the

permutation-level Plackett-Luce model in ListMLE with the top- k Plackett-Luce

model; and change the mapping function in ListNet to retain the order for the top- k

positions in the ground-truth permutation and assign to all the remaining positions

a small value (which is smaller than the score of any document ranked at the top- k

positions). Experimental results have shown that with such modifications, the per-

formances of the algorithms can be enhanced in terms of the top- k true loss.

18.3.4 Regarding Weighted Kendall's τ

In [ 10 ], the consistency of some pairwise ranking methods and listwise ranking

methods with respect to the weighted Kendall's τ is studied. The weighted Kendall's

τ is defined as follows:

−

L 0 (f ;

x ,π y ) =

D(u,v)I { f(x π − 1

(18.7)

(u) )

−

f(x π − 1

(v) )

≤

}

where the ground truth is assumed to be a permutation π y , D(u,v) denotes a weight

function on a pair of documents whose positions are u and v in the ground-truth

permutation respectively. When D(u,v)

D(u)

−

D(v) , we call it the Difference-

Weighting Kendall's τ . When D(u,v)

1, we call it the unweighted Kendall's τ .

Learning to Rank for Information Retrieval

Search WWH ::

Custom Search

Home