Information Technology Reference
The above research direction is related to rank aggregation. In rank aggregation,
the input space consists of permutations, and the output space also consists of per-
mutations. There is no feature representation for each single document. The task is
to learn the optimal way of aggregating the candidate ranked lists in the input space,
to generate the desired ranked list in the output space. There have been several pre-
vious works [ 7 , 9 ] discussing how to use permutation probability models (e.g., the
Plackett-Luce model and the Mallows model) to perform unsupervised rank aggre-
gation. It would be an interesting topic to study how to further leverage labeled data
and develop supervised versions for these algorithms. One preliminary effort can be
found in [ 11 ], and we look forward to seeing more such attempts.
Furthermore, most existing work on learning to rank belongs to discriminative
learning. However, as we notice, generative learning is also a very important branch
of machine learning, which is good at modeling causal relationship or dependency
between random variables. There is no reason that generative learning cannot be
used in ranking. This could be a promising research direction of learning to rank,
from both algorithmic and theoretical points of view.
20.5 Large-Scale Learning to Rank
In the literature of learning to rank, researchers have paid a lot of attention to the
design of loss functions, but somehow overlooked the efficiency and scalability
of algorithms. The latter, however, has become a more and more important issue
nowadays, especially due to the availability of large-scale click-through data in web
search that can be used to train the learning-to-rank models.
While it is good to have more training data, it is challenging for many existing
algorithms to handle such data. In order to tackle the challenge, we may want to
consider one of the following approaches.
Parallel computing . For example, we can use the MPI or MapReduce infrastruc-
ture to distribute the computations in the algorithms. There are a number of at-
tempts on distributed machine learning in the literature [ 4 , 5 ], however, the efforts
on learning to rank are still very limited.
Ensemble learning . We can down-sample the data to make it easy to handle by a
single-machine algorithm. After learning a ranking model based on this sample,
we can repeat the sampling for multiple times and aggregate the ranking models
obtained from all the samples. It has been proven in [ 16 ] that such ensemble
learning can effectively make use of large datasets and the resultant model can be
more effective than the model learned from every single sample.
Approximate algorithms . In some cases, we can derive an approximate version
of an existing algorithm, whose complexity is much lower than the original algo-
rithm while the accuracy remains to a certain extent. This kind of approaches has
been well studied in the literature of computational theory [ 15 ], but has still not
been sufficiently investigated in learning to rank.