Information Technology Reference

In-Depth Information

Fig. 4.1
Ranking function

used in RankGP

4.2.4 Discussions

As can be seen from the above examples, various methods have been proposed to

optimize loss functions that are related to the evaluation measures. Actually there

are even more such works, which have not been introduced in detail due to space

restrictions, such as [
23
,
26
].

In addition to these works, in [
35
] it is discussed whether an evaluation measure

is suitable for direct optimization. For this purpose, a concept called informative-

ness is proposed. Empirical studies show that more informative measures can lead

to more effective ranking models. Discussions are then conducted on why some

measures are more informative while the others are less so. Specifically, according

to the discussions, multi-level measures (such as NDCG) are more informative than

binary measures (such as AP), since they respond to any flips between documents

of different relevance degrees. Furthermore, given two binary (or multi-level) mea-

sures, one measure may still be more informative than the other for the following

reasons: (1) some measures respond only to flips in some specific part of the rank-

ing, usually the top end; (2) even if two measures depend on the same part of the

ranking, one may be insensitive to some of these flips within this part due to ignoring

some flips or due to the discount functions used.

The experiments with several direct optimization methods have verified the

above discussions. As an interesting result, it is not always the best choice to op-

timize a less informative measure even it is eventually used for evaluation. For ex-

ample, directly optimizing AP can lead to a better ranking performance in terms of

P@10 than directly optimizing P@10.

4.3 Minimization of Non-measure-Specific Loss

In the second sub-category of the listwise approach, the loss function, which is not

measure specific, reflects the inconsistency between the output of the ranking model

and the ground truth permutation
π
y
. Although evaluation measures are not directly

optimized here, if one can consider the distinct properties of ranking in information

retrieval in the design of the loss function, it is also possible that the model learned

can have good performance in terms of evaluation measures.

Example algorithms in this sub-category include ListNet [
4
], ListMLE [
31
],

StructRank [
16
], and BoltzRank [
30
]. We will give introductions to them in this

section.