Transfer Ranking - Learning to Rank for Information Retrieval

and target domains share the same set of features, and their differences only lie in the different distributions of the data. Denote the source and target distributions by $P_s$ and $P_t$, respectively; then we have

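$$
\begin{aligned}
w_t &= \arg\min_w \iint L(w; x, y)\, P_t(x, y)\, dx\, dy \\
    &= \arg\min_w \iint \frac{P_t(x, y)}{P_s(x, y)}\, L(w; x, y)\, P_s(x, y)\, dx\, dy \\
    &= \arg\min_w \iint \frac{P_t(x)}{P_s(x)}\, \frac{P_t(y \mid x)}{P_s(y \mid x)}\, L(w; x, y)\, P_s(x, y)\, dx\, dy.
\end{aligned}
\tag{9.2}
$$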

Let $\delta = \frac{P_t(y \mid x)}{P_s(y \mid x)}$ and $\eta = \frac{P_t(x)}{P_s(x)}$; one obtains

$$
w_t = \arg\min_w \iint \delta\, \eta\, L(w; x, y)\, P_s(x, y)\, dx\, dy.
\tag{9.3}
$$
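The identity behind the re-weighting can be checked numerically on a small discrete example. The sketch below is illustrative only: the two joint distributions, the loss table, and the function names are hypothetical and do not come from the text. It shows that weighting the source-domain risk by $\delta\eta$ reproduces the target-domain risk whenever $P_s(x, y) > 0$ wherever $P_t(x, y) > 0$.

```python
import numpy as np

# Hypothetical joint distributions over 3 feature values (rows) and
# 2 labels (columns). The numbers are made up for illustration.
P_s = np.array([[0.20, 0.10],
                [0.25, 0.15],
                [0.10, 0.20]])
P_t = np.array([[0.10, 0.20],
                [0.15, 0.15],
                [0.25, 0.15]])

# Hypothetical loss value L(w; x, y) for each (x, y) cell at some fixed w.
loss = np.array([[0.0, 1.0],
                 [0.5, 0.5],
                 [2.0, 0.1]])


def reweighting_factors(P_s, P_t):
    """Return (delta, eta): delta = P_t(y|x) / P_s(y|x), eta = P_t(x) / P_s(x)."""
    Ps_x = P_s.sum(axis=1, keepdims=True)  # marginal P_s(x), shape (3, 1)
    Pt_x = P_t.sum(axis=1, keepdims=True)  # marginal P_t(x), shape (3, 1)
    delta = (P_t / Pt_x) / (P_s / Ps_x)    # ratio of the conditionals
    eta = Pt_x / Ps_x                      # ratio of the marginals
    return delta, eta


def target_risk(loss, P_t):
    """Expected loss under the target distribution."""
    return float((loss * P_t).sum())


def reweighted_source_risk(loss, P_s, P_t):
    """Expected loss under the source distribution, weighted by delta * eta."""
    delta, eta = reweighting_factors(P_s, P_t)
    return float((delta * eta * loss * P_s).sum())


print(target_risk(loss, P_t), reweighted_source_risk(loss, P_s, P_t))
```

Because the two risks agree for every loss table, they also share the same minimizer in $w$, which is the content of the derivation.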

In other words, with re-weighting factors, the minimization of the loss on the source domain can also lead to the optimal ranking function on the target domain. Therefore, in practice, $w_t$ can be learned by minimizing the re-weighted empirical risk on the source-domain data:

$$
w_t = \arg\min_w \sum_{i=1}^{n_s} \delta_i\, \eta_i\, L\bigl(w; x_s^{(i)}, y_s^{(i)}\bigr),
\tag{9.4}
$$

where the subscript $s$ means the source domain, and $n_s$ is the number of queries in the source-domain data.
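A minimal sketch of re-weighted empirical risk minimization follows. It assumes a linear scoring function and a pairwise logistic loss as the query-level loss $L$; neither choice is prescribed by the text, and the function names, learning rate, and step count are hypothetical. Each query contributes its gradient scaled by one weight, which stands for the product $\delta_i \eta_i$.

```python
import numpy as np


def pairwise_logistic_loss_and_grad(w, X, y):
    """Query-level loss: sum of log(1 + exp(-(s_u - s_v))) over pairs with y_u > y_v."""
    scores = X @ w
    loss, grad = 0.0, np.zeros_like(w)
    for u in range(len(y)):
        for v in range(len(y)):
            if y[u] > y[v]:
                margin = scores[u] - scores[v]
                loss += np.logaddexp(0.0, -margin)
                # d/dw log(1 + exp(-m)) = -(x_u - x_v) / (1 + exp(m))
                grad -= (X[u] - X[v]) / (1.0 + np.exp(margin))
    return loss, grad


def fit_reweighted(queries, weights, dim, lr=0.1, steps=200):
    """Gradient descent on sum_i weights[i] * L(w; X_i, y_i).

    queries: list of (X, y) with X of shape (n_docs, dim) and y the relevance labels.
    weights: one non-negative weight per query, playing the role of delta_i * eta_i.
    """
    w = np.zeros(dim)
    for _ in range(steps):
        total_grad = np.zeros(dim)
        for (X, y), weight in zip(queries, weights):
            _, grad = pairwise_logistic_loss_and_grad(w, X, y)
            total_grad += weight * grad
        w -= lr * total_grad
    return w


# Two toy source queries with opposite preferences over the two features.
query_a = (np.array([[1.0, 0.0], [0.0, 1.0]]), np.array([1, 0]))
query_b = (np.array([[1.0, 0.0], [0.0, 1.0]]), np.array([0, 1]))
w = fit_reweighted([query_a, query_b], weights=[1.0, 0.0], dim=2)
print(w)
```

With the second query weighted to zero, only the first query shapes the model, so the learned weight on the first feature ends up larger than the weight on the second.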

In [1], it is assumed that $\eta_i = 1$. In other words, there is no difference in the distribution of $x$ for the source and target domains. To set $\delta_i$, the following heuristic method is used. First a ranking model is trained from the target-domain data and then it is tested on the source-domain data. If a pair of documents in the source-domain data is ranked correctly, the corresponding pair is retained and assigned a weight; otherwise, it is discarded. Since in learning to rank each document pair is associated with a specific query, the pairwise precision of this query is used to determine $\delta_i$:

$$
\delta_i = \frac{\#\ \text{pairs correctly ranked of a query}}{\#\ \text{total pairs of a query}}
$$
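The pairwise precision of a single query can be computed as in the sketch below. It assumes that a pair counts only when the two documents have different relevance labels, that a pair is ranked correctly when the more relevant document receives the strictly higher score, and that a query with no such pairs gets weight 0. These conventions and the function name are assumptions for illustration, not details given in [1].

```python
def pairwise_precision(scores, labels):
    """Fraction of preference pairs (label_u > label_v) that the scores order correctly."""
    correct = 0
    total = 0
    n = len(labels)
    for u in range(n):
        for v in range(n):
            if labels[u] > labels[v]:
                total += 1
                if scores[u] > scores[v]:
                    correct += 1
    # Assumed convention: a query without any preference pair gets weight 0.
    return correct / total if total else 0.0


# Scores that a target-domain model might assign to one source-domain query.
delta_i = pairwise_precision(scores=[3.0, 1.0, 2.0], labels=[2, 1, 0])
print(delta_i)
```

In this toy query two of the three preference pairs are ordered correctly, so the query would receive a weight of 2/3.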

According to the experimental results in [1], the instance-level method only works well for certain datasets. On some other datasets, its performance is even worse than using only the target-domain data. This in a sense shows that simple re-weighting might not effectively bridge the gap between the source and target domains.
