A standard way of addressing the sample selection bias is to reweight the training samples such that the reweighted training distribution matches the test distribution. These weights can be found through logistic regression [2]. Once the weights have been estimated, they can readily be incorporated into a pointwise learning-to-rank algorithm. How to use the weights in a pairwise or listwise algorithm is an interesting research question. Another way of correcting the sample selection bias is to improve the scheme used for collecting training data. The pooling strategy could, for instance, be modified to include documents deeper in the ranking, thus reflecting the test distribution more closely. But judging more documents has a cost, which raises the question of how to select the training documents under a fixed labeling budget. This has been discussed in Sect. 13.3.
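As an illustration, the sketch below estimates such importance weights with a logistic-regression classifier that discriminates training samples from test samples, and then passes them to a pointwise learner as per-sample weights. The data arrays, feature dimensionality, and the use of scikit-learn are assumptions made only for this example.

import numpy as np
from sklearn.linear_model import LogisticRegression

# Placeholder query-document feature vectors (assumed for illustration):
# X_train comes from the judged training pool, X_test is sampled from the
# distribution we actually want to rank on.
rng = np.random.default_rng(0)
X_train = rng.random((1000, 50))
X_test = rng.random((5000, 50))

# Train a domain classifier: label 0 = training pool, label 1 = test sample.
X_all = np.vstack([X_train, X_test])
d_all = np.concatenate([np.zeros(len(X_train)), np.ones(len(X_test))])
domain_clf = LogisticRegression(max_iter=1000).fit(X_all, d_all)

# Importance weight w(x) = P_test(x) / P_train(x)
#                        = P(d=1|x) / P(d=0|x) * (n_train / n_test).
p_test = domain_clf.predict_proba(X_train)[:, 1]
weights = (p_test / (1.0 - p_test)) * (len(X_train) / len(X_test))

# The weights can then be handed to any pointwise learner that supports
# per-sample weighting, e.g. a weighted regression on relevance labels y_train:
# ranker = Ridge().fit(X_train, y_train, sample_weight=weights)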
Note that the sample selection bias is related to transfer learning, as both deal with the case of different training and test distributions. But under sample selection bias, even though the marginal distribution P(x) changes between training and test, the conditional output distribution P(y|x) is assumed to be fixed. In most transfer learning scenarios, this conditional output distribution shifts between training and test.
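In symbols, sample selection bias corresponds to the standard covariate-shift assumption (stated here in the chapter's notation for concreteness):

P_train(x) ≠ P_test(x)   while   P_train(y|x) = P_test(y|x),

so the correction weight for a training sample is w(x) = P_test(x) / P_train(x). In most transfer learning settings, by contrast, P(y|x) itself is allowed to change.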
20.2 Direct Learning from Logs
In addition to the sample selection bias, there is another issue with the training sets for learning to rank: the human-labeled training sets are of relatively small scale. Considering the huge query space, even hundreds or thousands of labeled queries may not be enough to guarantee the effectiveness of a learning-to-rank algorithm. Developing very large-scale datasets is therefore important.
Click-through log mining is one possible approach to achieve this goal. Several works have pursued this direction, as reviewed in Sect. 13.2. However, these works also have certain limitations. Basically, they have tried to map the click-through logs to judgments, in terms of either pairwise preferences or multiple ordered categories. However, this conversion is not always necessary (and sometimes not even reasonable). Note that there is rich information in the click-through logs, e.g., the user sessions, the frequency of clicking a certain document, the frequency of a certain click pattern, and the diversity in the intentions of different users. After the conversion of the logs to pointwise or pairwise judgments, much of this information is lost. Therefore, it is meaningful to reconsider the problem, and probably to change the learning algorithms to adapt to the log data. For example, one can directly regard the click-through logs (without conversion) as the ground truth, and define loss functions based on the likelihood of the log data.
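As a concrete (and deliberately simplified) instance of this idea, the sketch below scores the documents shown in one logged session and defines the loss as the negative log-likelihood of the observed clicks under a simple position-based click model. The examination probabilities, the variable names, and the click model itself are assumptions for illustration only.

import numpy as np

def click_negative_log_likelihood(scores, clicks, exam_prob):
    # Simple position-based click model (assumed for illustration):
    # P(click at rank k) = exam_prob[k] * sigmoid(score[k]).
    relevance = 1.0 / (1.0 + np.exp(-scores))
    p_click = np.clip(exam_prob * relevance, 1e-8, 1 - 1e-8)
    log_lik = clicks * np.log(p_click) + (1 - clicks) * np.log(1 - p_click)
    return -log_lik.sum()

# One logged session with five shown documents (toy numbers).
scores = np.array([2.0, 1.2, 0.3, -0.5, -1.0])   # current ranker scores, by rank
clicks = np.array([1, 0, 1, 0, 0])               # observed clicks from the log
exam_prob = np.array([1.0, 0.8, 0.6, 0.4, 0.3])  # assumed position bias
loss = click_negative_log_likelihood(scores, clicks, exam_prob)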
Furthermore, unlike human-labeled data, click-through logs are by nature streaming data, generated continuously as long as users visit the search engine. Therefore, learning from click-through logs should also be an online process. That is, when the training data shift from human-labeled data to click-through data, the learning scheme should also change from offline learning to online learning.
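A minimal sketch of such an online scheme, assuming a linear scoring model that takes one stochastic gradient step per incoming click record (the model, loss, and update rule are illustrative choices, not a prescription from the text):

import numpy as np

class OnlineClickRanker:
    # Linear scoring model updated online from streaming click events.
    def __init__(self, n_features, lr=0.01):
        self.w = np.zeros(n_features)
        self.lr = lr

    def score(self, x):
        return self.w @ x

    def update(self, x, clicked):
        # One SGD step on the logistic loss of the observed click signal.
        p = 1.0 / (1.0 + np.exp(-self.score(x)))
        self.w -= self.lr * (p - clicked) * x

# Streaming usage: update the model as click-through records arrive.
ranker = OnlineClickRanker(n_features=50)
rng = np.random.default_rng(0)
for _ in range(3):                 # stands in for an endless log stream
    x = rng.random(50)             # features of a displayed document
    clicked = rng.integers(0, 2)   # observed click (0 or 1)
    ranker.update(x, clicked)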