where

$$
h_i(x_u, x_v) = \mathrm{sigmoid}\Bigl(\sum_t \bigl(\theta_{u,t,i}\, x_{u,t} + \theta_{v,t,i}\, x_{v,t}\bigr) + b_i\Bigr)
             = \mathrm{sigmoid}\Bigl(\sum_t \bigl(\theta_{u,t,i}\, x_{v,t} + \theta_{v,t,i}\, x_{u,t}\bigr) + b_i\Bigr)
             = h_i(x_v, x_u). \tag{3.6}
$$
Then, the optimal parameters $\theta$, $w$, and $b$ are learned by minimizing the following loss function:

$$
L(h; x_u, x_v, y_{u,v}) = \bigl(y_{u,v} - P(x_u \triangleright x_v)\bigr)^2 + \bigl(y_{v,u} - P(x_u \triangleleft x_v)\bigr)^2. \tag{3.7}
$$
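The symmetry asserted in Eq. (3.6) can be illustrated with a minimal Python sketch. The two-feature documents, the weight values, and the specific sharing scheme ($\theta_{u,t,i} = \theta_{v,t,i}$, one way to make the two sums coincide) are all hypothetical, chosen only so that swapping the inputs provably leaves the hidden activation unchanged:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def hidden_unit(x_u, x_v, theta_u, theta_v, b):
    """Hidden unit h_i of Eq. (3.6): a sigmoid over a weighted sum of
    both documents' features plus a bias term b_i."""
    z = sum(tu * xu + tv * xv
            for tu, tv, xu, xv in zip(theta_u, theta_v, x_u, x_v)) + b
    return sigmoid(z)

# Hypothetical 2-feature documents and weights (illustration only).
x_u, x_v = [0.2, 0.5], [0.7, 0.1]
theta_u, theta_v = [0.3, -0.4], [0.3, -0.4]  # tied weights enforce the symmetry

h_uv = hidden_unit(x_u, x_v, theta_u, theta_v, b=0.1)
h_vu = hidden_unit(x_v, x_u, theta_u, theta_v, b=0.1)
assert abs(h_uv - h_vu) < 1e-12  # h_i(x_u, x_v) == h_i(x_v, x_u)
```

Because the two weight vectors are tied, the first and second sums in Eq. (3.6) are identical, so the equality $h_i(x_u, x_v) = h_i(x_v, x_u)$ holds by construction.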
For testing, the learned preference function is used to generate pairwise prefer-
ences for all possible document pairs. Then an additional sorting (aggregation) step,
just as in [12], is used to resolve the conflicts in these pairwise preferences and to
generate a final ranked list.
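The testing procedure above can be sketched in Python. The greedy rule below (repeatedly emit the document whose total preference over the remaining documents is largest) is only one simple aggregation scheme, in the spirit of [12]; the document names, scores, and preference function are hypothetical:

```python
def aggregate_ranking(docs, pref):
    """Greedy aggregation of pairwise preferences: at each step, pick the
    remaining document with the largest summed preference over the others."""
    remaining = list(docs)
    ranked = []
    while remaining:
        best = max(remaining,
                   key=lambda u: sum(pref(u, v) for v in remaining if v != u))
        ranked.append(best)
        remaining.remove(best)
    return ranked

# Hypothetical preference function: prefers the doc with the higher score.
scores = {"d1": 0.9, "d2": 0.2, "d3": 0.5}
pref = lambda u, v: 1.0 if scores[u] > scores[v] else 0.0
print(aggregate_ranking(["d1", "d2", "d3"], pref))  # ['d1', 'd3', 'd2']
```

When the pairwise preferences contain cycles, no ordering satisfies them all; the greedy step simply picks the least-contested document, which is why an explicit aggregation stage is needed at all.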
3.2.3 RankNet: Learning to Rank with Gradient Descent
RankNet [8] is one of the learning-to-rank algorithms used by commercial search engines.¹
In RankNet, the loss function is also defined on a pair of documents, but the hypothesis is defined with the use of a scoring function $f$. Given two documents $x_u$ and $x_v$ associated with a training query $q$, a target probability $\bar{P}_{u,v}$ is constructed based on their ground-truth labels. For example, we can define $\bar{P}_{u,v} = 1$ if $y_{u,v} = 1$, and $\bar{P}_{u,v} = 0$ otherwise. Then, the modeled probability $P_{u,v}$ is defined based on the difference between the scores of these two documents given by the scoring function, i.e.,

$$
P_{u,v}(f) = \frac{\exp\bigl(f(x_u) - f(x_v)\bigr)}{1 + \exp\bigl(f(x_u) - f(x_v)\bigr)}. \tag{3.8}
$$
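Eq. (3.8) is simply a logistic (sigmoid) function of the score difference, as a small sketch shows; the numeric scores are hypothetical placeholders for $f(x_u)$ and $f(x_v)$:

```python
import math

def modeled_prob(s_u, s_v):
    """Modeled probability of Eq. (3.8): a logistic function of the
    score difference f(x_u) - f(x_v)."""
    diff = s_u - s_v
    return math.exp(diff) / (1.0 + math.exp(diff))

# Equal scores give probability 0.5; a higher score for x_u pushes it toward 1.
print(modeled_prob(2.0, 2.0))  # 0.5
print(modeled_prob(3.0, 1.0))  # sigmoid(2) ~ 0.88
```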
Then the cross entropy between the target probability and the modeled probability is used as the loss function, which we refer to as the cross entropy loss for short:

$$
L(f; x_u, x_v, y_{u,v}) = -\bar{P}_{u,v} \log P_{u,v}(f) - \bigl(1 - \bar{P}_{u,v}\bigr) \log\bigl(1 - P_{u,v}(f)\bigr). \tag{3.9}
$$
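Substituting Eq. (3.8) into Eq. (3.9) reduces the pairwise loss to a logistic loss on the score difference, which gradient descent can then minimize with respect to the parameters of $f$. A minimal sketch, assuming the target probability is 1.0 when $y_{u,v} = 1$ (the scores are hypothetical):

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def cross_entropy_loss(s_u, s_v, target):
    """Cross entropy loss of Eq. (3.9): `target` is the desired probability
    that x_u beats x_v, and the modeled probability is the sigmoid of the
    score difference, as in Eq. (3.8)."""
    p = sigmoid(s_u - s_v)
    return -target * math.log(p) - (1.0 - target) * math.log(1.0 - p)

# With target 1.0, the loss shrinks as f(x_u) pulls ahead of f(x_v):
assert cross_entropy_loss(3.0, 1.0, 1.0) < cross_entropy_loss(1.0, 3.0, 1.0)
```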
¹ As far as we know, Microsoft Bing Search (http://www.bing.com/) uses a model trained with a variation of RankNet.