3.2.7 GBRank
In [36], a pairwise ranking method based on the Gradient Boosting Tree [20] is proposed. Basically, the following loss function is used:
L(f; x_u, x_v, y_{u,v}) = \frac{1}{2}\bigl(\max\{0,\, y_{u,v}(f(x_v) - f(x_u))\}\bigr)^2.    (3.18)
The above loss function shows that if the preference predicted by the scoring function f is inconsistent with the ground truth label y_{u,v}, a squared loss will be incurred; otherwise there is no loss. Since direct optimization of the above loss is difficult, the basic idea is to fix either one of the values f(x_u) and f(x_v), e.g., replace either one of the function values by its current predicted value, and solve the remaining problem by means of regression.
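To make this concrete (a small worked case in the same notation; the full alternating scheme is spelled out in Algorithm 2 below), suppose the label indicates that x_u is preferred to x_v (say y_{u,v} = 1), and fix f(x_v) at its current estimate f_{t-1}(x_v). The loss (3.18) then reduces to

\frac{1}{2}\bigl(\max\{0,\, f_{t-1}(x_v) - f(x_u)\}\bigr)^2,

which is zero whenever f(x_u) \ge f_{t-1}(x_v); a misranked pair therefore simply contributes a regression example that pushes f(x_u) toward the fixed value f_{t-1}(x_v).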
To avoid obtaining a constant scoring function as the optimum, a regularization term is further introduced as follows:
L(f; x_u, x_v, y_{u,v}) = \frac{1}{2}\bigl(\max\{0,\, y_{u,v}(f(x_v) - f(x_u)) + \tau\}\bigr)^2 - \lambda\tau^2.    (3.19)
To summarize, Algorithm 2, named GBRank, is used to learn the ranking function.
Algorithm 2: Learning algorithm for GBRank
Input: training data in terms of document pairs
Given: initial guess of the ranking function f_0.
For t = 1, ..., T:
  Using f_{t-1} as the current approximation of f, separate the training data into two disjoint sets,
    S^{+} = \{\langle x_u, x_v\rangle \mid f_{t-1}(x_u) \ge f_{t-1}(x_v) + \tau\},
    S^{-} = \{\langle x_u, x_v\rangle \mid f_{t-1}(x_u) < f_{t-1}(x_v) + \tau\}.
  Fit a regression function g_t(x) using the gradient boosting tree [20] and the following training data,
    \{(x_u, f_{t-1}(x_v) + \tau), (x_v, f_{t-1}(x_u) - \tau) \mid \langle x_u, x_v\rangle \in S^{-}\}.
  Form (with normalization of the range of f_t)
    f_t(x) = \frac{t \cdot f_{t-1}(x) + \eta g_t(x)}{t + 1},
  where η is a shrinkage factor.
Output: f(x) = f_T(x).
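To make the above procedure concrete, here is a minimal Python sketch of the GBRank loop, using scikit-learn's GradientBoostingRegressor in place of the gradient boosting tree of [20]. It is an illustration under simplifying assumptions, not the implementation from [36]: the names gbrank_fit, pairs, tau, and eta are chosen here for readability, and for brevity the sketch only scores the training documents rather than keeping the fitted trees to score unseen ones.

# Minimal GBRank-style sketch, assuming scikit-learn is available.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

def gbrank_fit(X, pairs, T=10, tau=1.0, eta=0.05):
    """X: feature matrix (n_docs x n_features); pairs: list of (u, v) index
    pairs meaning document u should be ranked above document v."""
    f = np.zeros(X.shape[0])                      # f_0: initial guess (all zeros)
    for t in range(1, T + 1):
        # Pairs violating the margin tau form S-; the rest (S+) are left unchanged.
        s_minus = [(u, v) for (u, v) in pairs if f[u] < f[v] + tau]
        if not s_minus:
            break                                 # every pair already satisfied
        # Regression data: push x_u up to f(x_v)+tau and x_v down to f(x_u)-tau.
        Xr, yr = [], []
        for u, v in s_minus:
            Xr.append(X[u]); yr.append(f[v] + tau)
            Xr.append(X[v]); yr.append(f[u] - tau)
        g_t = GradientBoostingRegressor(n_estimators=50, max_depth=3)
        g_t.fit(np.asarray(Xr), np.asarray(yr))
        # Normalized combination: f_t = (t * f_{t-1} + eta * g_t) / (t + 1).
        f = (t * f + eta * g_t.predict(X)) / (t + 1)
    return f

# Toy usage: three documents, doc 0 preferred over doc 1, doc 1 over doc 2.
X = np.array([[0.9, 0.1], [0.5, 0.5], [0.1, 0.9]])
scores = gbrank_fit(X, pairs=[(0, 1), (1, 2)], T=5)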
In [38], some improvements over GBRank are made. Specifically, a quadratic approximation to the loss function of GBRank is constructed based on Taylor expansion, and then an upper bound of the loss function, which is easier to optimize, is derived. Experimental results show that the new method can outperform GBRank in many settings.
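As a rough illustration of this quadratic-approximation idea (a hedged sketch only; the exact derivation in [38] may differ), one can expand a pairwise loss \ell to second order around the current score difference d_0 = f_{t-1}(x_u) - f_{t-1}(x_v):

\ell(d) \approx \ell(d_0) + \ell'(d_0)(d - d_0) + \frac{1}{2}\ell''(d_0)(d - d_0)^2,

which is quadratic in the new score difference d and therefore easy to minimize with regression-style updates; replacing the curvature term with a constant that dominates it yields an upper bound of the loss.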
In [39], the idea of GBRank is further extended, and those pairs of documents with the same label (i.e., the so-called ties) are also considered in the loss function. For this purpose, well-known pairwise comparison models are used to define the corresponding loss.
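One classical pairwise comparison model of this kind is the Bradley–Terry model (given here only as a standard example, not necessarily the exact form adopted in [39]), which models the preference probability as

P(x_u \succ x_v) = \frac{\exp(f(x_u))}{\exp(f(x_u)) + \exp(f(x_v))};

extensions of such models assign an explicit probability to a tie between x_u and x_v, so that tied pairs can also contribute to the loss.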