Information Technology Reference
In-Depth Information
Fig. 4.1 Ranking function
used in RankGP
4.2.4 Discussions
As can be seen from the above examples, various methods have been proposed to
optimize loss functions that are related to the evaluation measures. Actually there
are even more such works, which have not been introduced in detail due to space
restrictions, such as [ 23 , 26 ].
In addition to these works, in [ 35 ] it is discussed whether an evaluation measure
is suitable for direct optimization. For this purpose, a concept called informative-
ness is proposed. Empirical studies show that more informative measures can lead
to more effective ranking models. Discussions are then conducted on why some
measures are more informative while the others are less so. Specifically, according
to the discussions, multi-level measures (such as NDCG) are more informative than
binary measures (such as AP), since they respond to any flips between documents
of different relevance degrees. Furthermore, given two binary (or multi-level) mea-
sures, one measure may still be more informative than the other for the following
reasons: (1) some measures respond only to flips in some specific part of the rank-
ing, usually the top end; (2) even if two measures depend on the same part of the
ranking, one may be insensitive to some of these flips within this part due to ignoring
some flips or due to the discount functions used.
The experiments with several direct optimization methods have verified the
above discussions. As an interesting result, it is not always the best choice to op-
timize a less informative measure even it is eventually used for evaluation. For ex-
ample, directly optimizing AP can lead to a better ranking performance in terms of
P@10 than directly optimizing P@10.
4.3 Minimization of Non-measure-Specific Loss
In the second sub-category of the listwise approach, the loss function, which is not
measure specific, reflects the inconsistency between the output of the ranking model
and the ground truth permutation π y . Although evaluation measures are not directly
optimized here, if one can consider the distinct properties of ranking in information
retrieval in the design of the loss function, it is also possible that the model learned
can have good performance in terms of evaluation measures.
Example algorithms in this sub-category include ListNet [ 4 ], ListMLE [ 31 ],
StructRank [ 16 ], and BoltzRank [ 30 ]. We will give introductions to them in this
Search WWH ::

Custom Search