Information Technology Reference
It is clear that the larger the value is, the more likely x v is irrelevant but still
ranked on the top of the final ranked list. As mentioned before, we should punish this
situation since errors occurring at the top positions will damage the user experience.
The proposal in [ 29 ] is to push such x v down from the top. Specifically, a convex,
non-negative, monotonically increasing function g is introduced for this purpose. If
g is very steep, an extremely high price for a high-ranked irrelevant documents will
be paid. Examples of steep functions include the exponential function ( g(r)
e r )
r p , where p is large). The power function is used
in [ 29 ] as an example. Accordingly, the overall loss with regards to x v becomes
and the power function ( g(r)
x u ,x v ,y u,v ) p
x v )
u,y u,v =
After formulating the above loss function, a Boosting-style algorithm is devel-
oped to minimize the loss. Experimental results show that by introducing the extra
penalty for the top-ranked irrelevant documents the ranking performance can be
3.3.6 Ordered Weighted Average for Ranking
In [ 34 ], Usunier et al. also aim at solving the forth problem with the pairwise ap-
proach. In particular, a weighted pairwise classification method is proposed, which
emphasizes the correct ranking at the top positions of the final ranked list. Specif-
ically, convex Ordered Weighted Averaging (OWA) operators [ 35 ] are used, which
are parameterized by a set of decreasing weights α t (the weight α t is associated to
the t th highest loss), to make a weighted sum of the pairwise classification losses.
By choosing appropriate weights the convex OWA operators can lead to a loss func-
tion that focuses on the top-ranked documents.
(s 1 ,...,s m ) , its corresponding OWA is defined as follows:
owa α (s) =
α t s σ(t) ,
where σ is a permutation such that
s σ(t + 1 ) .
And accordingly, the loss functions in the pairwise ranking algorithms can be
refined as follows.
t,s σ(t) ≥
owa v,y u,v = 1 L(f
y u,v ,x u ,x v ) .
x u )
In general, the above loss associates the largest weights to the largest pairwise
losses, and thus to the document pairs containing the irrelevant documents with the
greatest relevance scores.
Experimental results demonstrate that the use of the OWA based loss can lead to
better performance than the use of the original pairwise loss functions.