Fig. 3.10 The ramp loss
timization of it will be easily trapped into a local optimum. To tackle this challenge,
the new ranker is used as a second optimization step, refining the ranking model
produced by another ranker (e.g., Ranking SVM [22, 23]). In this way, with a relatively
good starting point, it is more likely that the optimization of the sigmoid loss
will lead to a reliably good solution.
The experimental results in [10] show that by using the new loss function, a better
ranking performance can be achieved.
In recent years, researchers have used similar ideas to improve the accuracy
of classification. For example, the ramp loss (as shown in Fig. 3.10) is used
in [13]. Like the sigmoid loss, the ramp loss also bounds the excessively large loss
incurred on outliers. According to [13], by using the ramp loss in SVMs, the number of
support vectors can be significantly reduced. Further, considering the results presented
in [3], better generalization ability can be achieved in this way.
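
To make the contrast with the hinge loss concrete, the following minimal sketch assumes a common formulation of the ramp loss as a hinge loss clipped at a constant: R_s(z) = min(1 - s, max(0, 1 - z)), where z = y f(x) is the margin and s < 1 is a clipping parameter. The function names and the default s = -1 are illustrative choices, not taken from [13].

```python
def hinge_loss(margin):
    """Standard hinge loss max(0, 1 - z); grows without bound as z decreases."""
    return max(0.0, 1.0 - margin)


def ramp_loss(margin, s=-1.0):
    """Hinge loss clipped at 1 - s (assumed formulation; requires s < 1)."""
    if s >= 1.0:
        raise ValueError("s must be smaller than 1")
    return min(1.0 - s, hinge_loss(margin))


if __name__ == "__main__":
    # Compare both losses on a well-classified point, a margin violation,
    # and two increasingly severe outliers.
    for z in (2.0, 0.5, -1.0, -10.0):
        print(z, hinge_loss(z), ramp_loss(z))
```

Under this formulation a badly misclassified point contributes at most 1 - s to the objective, whereas its hinge loss keeps growing with the size of the violation.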
3.3.5 P-norm Push
Although by looking at only a pair of documents one cannot determine their rank
positions in the final ranked list, one can reasonably estimate the rank
position of a document by checking all the pairs containing the document. Since
top positions are important for users, one can penalize those errors occurring at top
positions based on the estimation. In this way, one may be able to solve the fourth
problem with the pairwise approach.
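
The rank estimation described above can be sketched as follows. This is only an illustration under simple assumptions: documents are represented by a list of model scores, a higher score means a better position, and tied documents share the same estimated rank. The function name `estimated_rank` is hypothetical.

```python
def estimated_rank(scores, v):
    """Estimate the 1-based rank of document v from all pairs (u, v).

    Every document u that is scored strictly above v pushes v one
    position further down the list.
    """
    beaten_by = sum(1 for u, s in enumerate(scores) if u != v and s > scores[v])
    return 1 + beaten_by


if __name__ == "__main__":
    scores = [2.5, 0.3, 1.7, 1.7]  # hypothetical model outputs
    print([estimated_rank(scores, v) for v in range(len(scores))])
```

An error involving a document whose estimated rank is small occurs near the top of the list, which is where a position-sensitive penalty would concentrate.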
Actually this is exactly the key idea in [29]. We refer to this algorithm as P-norm
push. Suppose the pairwise loss function is $L(f; x_u, x_v, y_{u,v})$; then for document
$x_v$, the overall error that it is mis-ranked before other documents can be written as

$$L(f; x_v) = \sum_{u,\, y_{u,v} = 1} L(f; x_u, x_v, y_{u,v}).$$
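
As a rough illustration of the per-document overall error, the sketch below sums a pairwise loss over every document that should be ranked above a given document. Several details are assumptions made only for this example: the preference pairs are stored as a set of index pairs (u, v) meaning that document u should precede document v, and an exponential pairwise loss stands in for the generic pairwise loss of the text. The names `pairwise_loss` and `overall_error` are hypothetical.

```python
import math


def pairwise_loss(score_u, score_v):
    """Illustrative surrogate loss for a pair in which u should precede v.

    The exponential form is an assumption for this sketch; it is large
    when v is scored above u and small when the pair is ordered correctly.
    """
    return math.exp(-(score_u - score_v))


def overall_error(scores, preferences, v, loss=pairwise_loss):
    """Sum the pairwise losses over all u that should be ranked above v."""
    return sum(loss(scores[u], scores[v]) for (u, w) in preferences if w == v)


if __name__ == "__main__":
    scores = [2.0, 1.0, 3.0]                 # hypothetical model outputs
    preferences = {(0, 2), (1, 2), (0, 1)}   # (u, v): u should precede v
    for v in range(len(scores)):
        print(v, overall_error(scores, preferences, v))
```

In this toy example document 2 is scored above both documents that should precede it, so it accumulates by far the largest overall error.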