Again, for ease of discussion, we sometimes also write NDCG as NDCG(π, y) or NDCG(f, x, y) (when π = sort ∘ f(x)).
4.2 Minimization of Measure-Specific Loss
The most straightforward choice might be to learn the ranking model by directly optimizing what is used to evaluate ranking performance. This is exactly the motivation of the first sub-category of the listwise approach, i.e., the direct optimization methods. However, the task is not as trivial as it seems. As mentioned before, evaluation measures such as NDCG and MAP are position based, and thus discontinuous and non-differentiable [26, 33]. The difficulty of optimizing such objective functions stems from the fact that most existing optimization techniques were developed to handle continuous and differentiable cases.
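To see why such measures resist gradient-based optimization, the rough sketch below (illustrative Python, not taken from the cited works; the 2^rel − 1 gain and log2 position discount are simply the common choices) computes NDCG from model scores. The measure depends on the scores only through the induced sorted order, so a small score perturbation either leaves NDCG unchanged or makes it jump when two documents swap positions.

```python
# Minimal NDCG sketch illustrating its position-based, piecewise-constant nature.
import numpy as np

def dcg(relevances):
    """DCG with the common 2^rel - 1 gain and log2(position + 1) discount."""
    relevances = np.asarray(relevances, dtype=float)
    positions = np.arange(1, len(relevances) + 1)
    return np.sum((2.0 ** relevances - 1.0) / np.log2(positions + 1))

def ndcg(scores, relevances):
    """NDCG(f, x, y): rank by descending score, then normalize by the ideal DCG."""
    order = np.argsort(scores)[::-1]                    # pi = sort by score
    ideal = dcg(sorted(relevances, reverse=True))       # DCG of the ideal ranking
    return dcg(np.asarray(relevances)[order]) / ideal if ideal > 0 else 0.0

scores = np.array([2.0, 1.9, 0.5])
labels = np.array([0, 2, 1])                            # graded relevance judgments
print(ndcg(scores, labels))                             # flat under tiny perturbations...
print(ndcg(scores + np.array([0.0, 0.2, 0.0]), labels)) # ...then jumps once the order flips
```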
To tackle these challenges, several attempts have been made. First, one can choose to optimize a continuous and differentiable approximation of the measure-based ranking error. By doing so, many existing optimization techniques can be leveraged. Example algorithms include SoftRank [27], Approximate Rank [25], and SmoothRank [7]. Second, one can alternatively optimize a continuous and differentiable (and sometimes even convex) bound of the measure-based ranking error. Example algorithms include SVM^map [36], SVM^ndcg [5], and PermuRank [33].¹ Third, one can choose to use optimization techniques that are able to handle non-smooth objectives. For example, one can leverage the Boosting framework for this purpose (the corresponding algorithm is called AdaRank [32]), or adopt genetic programming for the optimization (the corresponding algorithm is called RankGP [34]).
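To make the second strategy concrete in its simplest form, the toy sketch below follows the classification analogy of footnote 1 rather than any of the cited ranking algorithms (the specific margin values are my own illustrative assumption): the 0-1 loss is non-differentiable, while the exponential loss is a smooth, convex upper bound that gradient-based methods can minimize instead.

```python
# Toy comparison of the 0-1 loss with its convex exponential upper bound.
import numpy as np

margins = np.linspace(-2.0, 2.0, 9)        # y * f(x) for a few hypothetical examples
zero_one = (margins <= 0).astype(float)    # 0-1 loss: discontinuous at margin 0
exp_bound = np.exp(-margins)               # exponential loss: smooth, convex, >= 0-1 loss
for m, z, e in zip(margins, zero_one, exp_bound):
    print(f"margin={m:+.1f}  0-1 loss={z:.0f}  exp bound={e:.3f}")
```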
4.2.1 Measure Approximation
4.2.1.1 SoftRank
In SoftRank [27], it is assumed that the ranking of the documents is not simply determined by sorting them according to the scoring function. Instead, randomness is introduced into the ranking process by regarding the real score of a document as a random variable whose mean is the score given by the scoring function. In this way, a document can be ranked at any position, of course with different probabilities. For each such possible ranking, one can compute an NDCG value. Then the expectation of NDCG over all possible rankings can be used as a smooth approximation of the original evaluation measure. The detailed steps to achieve this goal are elaborated as follows.
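Before those detailed steps, a rough Monte Carlo sketch of the idea may help. It is not SoftRank's actual rank-distribution recursion (which [27] derives analytically); it simply samples Gaussian scores around the model scores and averages NDCG over the sampled rankings, so names such as `sigma` and `n_samples` are my own illustrative assumptions.

```python
# Monte Carlo illustration of expected NDCG under Gaussian score perturbations.
import numpy as np

rng = np.random.default_rng(0)

def ndcg(scores, rel):
    """Plain NDCG of the ranking induced by sorting `scores` in descending order."""
    order = np.argsort(scores)[::-1]
    disc = 1.0 / np.log2(np.arange(2, len(rel) + 2))
    dcg = np.sum((2.0 ** np.asarray(rel, float)[order] - 1.0) * disc)
    ideal = np.sum((2.0 ** np.sort(np.asarray(rel, float))[::-1] - 1.0) * disc)
    return dcg / ideal if ideal > 0 else 0.0

def expected_ndcg(mean_scores, rel, sigma=0.5, n_samples=2000):
    """Approximate E[NDCG] when each score is Gaussian around the model score."""
    samples = mean_scores + sigma * rng.standard_normal((n_samples, len(mean_scores)))
    return float(np.mean([ndcg(s, rel) for s in samples]))

mean_scores = np.array([2.0, 1.9, 0.5])
labels = np.array([0, 2, 1])
print(expected_ndcg(mean_scores, labels))   # in expectation, smooth in mean_scores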
¹ Actually, this trick has also been used in classification. That is, since the 0-1 classification loss is non-differentiable, convex upper bounds like the exponential loss have been used instead.