and target domains share the same set of features, and differ only in the
distributions of the data. Denote the source and target distributions by $P_s$ and
$P_t$, respectively. Then we have
$$
\begin{aligned}
w_t &= \arg\min_w \iint L(w; x, y)\, P_t(x, y)\, dx\, dy \\
    &= \arg\min_w \iint \frac{P_t(x, y)}{P_s(x, y)}\, L(w; x, y)\, P_s(x, y)\, dx\, dy \\
    &= \arg\min_w \iint \frac{P_t(x)}{P_s(x)}\, \frac{P_t(y \mid x)}{P_s(y \mid x)}\, L(w; x, y)\, P_s(x, y)\, dx\, dy.
\end{aligned}
\tag{9.2}
$$
Let $\delta = \frac{P_t(y \mid x)}{P_s(y \mid x)}$ and $\eta = \frac{P_t(x)}{P_s(x)}$; one obtains
$$
w_t = \arg\min_w \iint \delta\, \eta\, L(w; x, y)\, P_s(x, y)\, dx\, dy.
\tag{9.3}
$$
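The identity behind (9.2) and (9.3) can be checked numerically on a small discrete
example, where the integrals become sums. The sketch below is only an illustration:
the distributions and loss values are made-up numbers, not taken from [1], and the
loss table stands for $L(w; x, y)$ at one fixed $w$.

```python
# Hypothetical toy distributions over x in {0, 1, 2} and y in {0, 1}.
# All numbers are invented for illustration.
p_s_x = [0.5, 0.3, 0.2]                                # P_s(x)
p_t_x = [0.2, 0.3, 0.5]                                # P_t(x)
p_s_y_given_x = [[0.9, 0.1], [0.6, 0.4], [0.3, 0.7]]   # P_s(y | x)
p_t_y_given_x = [[0.8, 0.2], [0.5, 0.5], [0.4, 0.6]]   # P_t(y | x)
loss = [[0.1, 2.0], [0.7, 0.9], [1.5, 0.2]]            # L(w; x, y) for one fixed w

target_risk = 0.0             # sum of L * P_t(x, y)
reweighted_source_risk = 0.0  # sum of delta * eta * L * P_s(x, y)
for x in range(3):
    eta = p_t_x[x] / p_s_x[x]
    for y in range(2):
        delta = p_t_y_given_x[x][y] / p_s_y_given_x[x][y]
        p_s_xy = p_s_x[x] * p_s_y_given_x[x][y]
        p_t_xy = p_t_x[x] * p_t_y_given_x[x][y]
        target_risk += loss[x][y] * p_t_xy
        reweighted_source_risk += delta * eta * loss[x][y] * p_s_xy

# The two risks agree up to floating-point rounding.
print(abs(target_risk - reweighted_source_risk) < 1e-12)
```

Because the two risks coincide for every fixed $w$, they share the same minimizer,
provided $P_s(x, y) > 0$ wherever $P_t(x, y) > 0$.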
In other words, with reweighting factors, the minimization of the loss on the
source domain can also lead to the optimal ranking function on the target domain.
Therefore, in practice, $w_t$ can be learned by minimizing the reweighted empirical
risk on the source-domain data:
$$
w_t = \arg\min_w \sum_{i=1}^{n_s} \delta_i\, \eta_i\, L\bigl(w; x_s^{(i)}, y_s^{(i)}\bigr),
\tag{9.4}
$$
where the subscript $s$ denotes the source domain, and $n_s$ is the number of queries in
the source-domain data.
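A minimal sketch of the objective in (9.4) follows. It assumes a linear scoring
function and a pairwise hinge loss as the per-query loss $L$; both are illustrative
choices rather than the method of [1], and the names `score`, `pairwise_hinge_loss`,
and `reweighted_empirical_risk` are hypothetical.

```python
from typing import Callable, List, Sequence, Tuple

Vector = Sequence[float]
# One query is a list of (feature_vector, relevance_label) documents.
Query = List[Tuple[Vector, float]]


def score(w: Vector, x: Vector) -> float:
    """Linear scoring function w . x (an assumption of this sketch)."""
    return sum(wi * xi for wi, xi in zip(w, x))


def pairwise_hinge_loss(w: Vector, query: Query) -> float:
    """Sum of hinge losses over ordered pairs (a, b) where a is more relevant than b."""
    total = 0.0
    for xa, ya in query:
        for xb, yb in query:
            if ya > yb:  # document a should be ranked above document b
                total += max(0.0, 1.0 - (score(w, xa) - score(w, xb)))
    return total


def reweighted_empirical_risk(
    w: Vector,
    source_queries: Sequence[Query],
    delta: Sequence[float],
    eta: Sequence[float],
    loss: Callable[[Vector, Query], float] = pairwise_hinge_loss,
) -> float:
    """Objective of (9.4): sum over source queries of delta_i * eta_i * L(w; query_i)."""
    if not (len(source_queries) == len(delta) == len(eta)):
        raise ValueError("need one delta and one eta per source query")
    return sum(d * e * loss(w, q) for q, d, e in zip(source_queries, delta, eta))


# Two toy source queries with two documents each (made-up data).
q1 = [([1.0, 0.0], 1.0), ([0.0, 1.0], 0.0)]
q2 = [([0.0, 1.0], 1.0), ([1.0, 0.0], 0.0)]
w = [1.0, 0.0]
risk = reweighted_empirical_risk(w, [q1, q2], delta=[1.0, 0.5], eta=[1.0, 1.0])
```

Here `w` ranks `q1` correctly with margin 1 (loss 0) and `q2` incorrectly (loss 2), so
halving the weight of `q2` halves its contribution to the risk. Any optimizer could
then be applied to this objective; the choice of optimizer is left open here.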
In [1], it is assumed that $\eta_i = 1$. In other words, there is no difference in the
distribution of $x$ between the source and target domains. To set $\delta_i$, the
following heuristic method is used. First, a ranking model is trained on the
target-domain data and then tested on the source-domain data. If a pair of documents
in the source-domain data is ranked correctly, the pair is retained and assigned a
weight; otherwise, it is discarded. Since in learning to rank each document pair is
associated with a specific query, the pairwise precision of this query is used to
determine $\delta_i$:
$$
\delta_i = \frac{\#\ \text{pairs correctly ranked of a query}}{\#\ \text{total pairs of a query}}.
$$
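The pairwise precision of a single query can be sketched as below. The function name
`pairwise_precision` is hypothetical, and three details are assumptions of this sketch
rather than statements from [1]: only pairs with different relevance labels are
counted, tied scores count as incorrect, and a query with no such pairs gets weight 0.
The scores are meant to come from the model trained on the target-domain data and
applied to the documents of one source-domain query.

```python
from itertools import combinations
from typing import Sequence


def pairwise_precision(scores: Sequence[float], labels: Sequence[float]) -> float:
    """Fraction of document pairs with different labels that the scores order correctly."""
    correct = 0
    total = 0
    for (s_a, y_a), (s_b, y_b) in combinations(zip(scores, labels), 2):
        if y_a == y_b:
            continue  # pairs with equal labels carry no preference
        total += 1
        if (s_a - s_b) * (y_a - y_b) > 0:  # score order agrees with label order
            correct += 1
    # Assumption: a query without any comparable pair gets weight 0.
    return correct / total if total else 0.0


# Made-up scores and labels for the three documents of one source-domain query.
delta_i = pairwise_precision(scores=[2.0, 1.0, 0.5], labels=[2, 0, 1])
```

In this example two of the three comparable pairs are ordered correctly, so `delta_i`
is 2/3.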
According to the experimental results in [1], the instance-level method works well
only for certain datasets. On some other datasets, its performance is even worse than
using the target-domain data alone. This suggests that simple reweighting might not
effectively bridge the gap between the source and target domains.