search engine (see Fig. 1.2). For more on widely used features, see
Tables 10.2 and 10.3 in Chap. 10.
Even if a feature is the output of an existing retrieval model, in the context of
learning to rank one assumes that the parameters of that model are fixed, and only
the optimal way of combining the features is learned. In this sense, previous
work on automatically tuning the parameters of existing models [36, 75] is not
categorized as “learning-to-rank” methods.
The capability of combining a large number of features is an advantage of
learning-to-rank methods. Any new progress on a retrieval model is easy to
incorporate: the output of the model is simply included as one more dimension of
the feature vector. This capability is in high demand for real search engines,
since it is almost impossible to satisfy the complex information needs of Web
users with only a few factors.
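The idea of treating a retrieval model's output as one feature dimension can be sketched as follows. This is only an illustrative sketch, not the book's implementation: the function names, the choice of BM25 as the existing model, the values `K1 = 1.2` and `B = 0.75`, the other two features, and the linear combination are all assumptions made for the example.

```python
import math

# Assumed, fixed parameters of an existing retrieval model (BM25 here).
# In the learning-to-rank view these are NOT tuned; only the combination is.
K1, B = 1.2, 0.75

def bm25_term(tf, doc_len, avg_doc_len, idf):
    """BM25 contribution of a single query term, using the fixed K1 and B."""
    norm = tf + K1 * (1 - B + B * doc_len / avg_doc_len)
    return idf * tf * (K1 + 1) / norm

def feature_vector(tf, doc_len, avg_doc_len, idf, pagerank):
    """Build a feature vector; a model's output is just one more dimension."""
    return [
        bm25_term(tf, doc_len, avg_doc_len, idf),  # output of an existing model
        math.log(1 + tf),                          # simple term-frequency feature
        pagerank,                                  # query-independent feature
    ]

def score(weights, features):
    """Linear combination of features; only `weights` would be learned."""
    return sum(w * x for w, x in zip(weights, features))
```

Under these assumptions, adding a new retrieval model amounts to appending one more entry in `feature_vector` and one more weight; nothing else in the scoring function changes.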
Discriminative Training. “Discriminative training” means that the learning
process can be well described by the four components of discriminative learning
mentioned in the previous subsection. That is, a learning-to-rank method has its
own input space, output space, hypothesis space, and loss function.
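To make the four components concrete, here is a minimal sketch under stated assumptions: a pointwise setting, linear scoring functions, and squared loss. These choices, and all names in the code, are illustrative and are not prescribed by the text; other learning-to-rank methods define each component differently.

```python
from typing import List, Sequence

Features = Sequence[float]  # input space: the feature vector of one document
Label = float               # output space: a relevance degree (pointwise view)

def hypothesis(w: Sequence[float], x: Features) -> float:
    """Hypothesis space: linear scoring functions f(x) = w . x."""
    return sum(wi * xi for wi, xi in zip(w, x))

def loss(w: Sequence[float], x: Features, y: Label) -> float:
    """Loss function: squared error between predicted score and label."""
    return (hypothesis(w, x) - y) ** 2

def sgd_step(w: Sequence[float], x: Features, y: Label, lr: float = 0.1) -> List[float]:
    """One gradient-descent step on the squared loss for a single example."""
    err = hypothesis(w, x) - y
    return [wi - lr * 2 * err * xi for wi, xi in zip(w, x)]
```

A short usage example: starting from `w = [0.0, 0.0]`, calling `sgd_step(w, [1.0, 2.0], 1.0)` moves the weights in the direction that reduces `loss` on that example.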
In the machine learning literature, discriminative methods have been widely
used to combine different kinds of features, without the need to define a
probabilistic framework that represents the generation of objects and the
correctness of prediction. In this sense, previous works that train generative
ranking models are not categorized as “learning-to-rank” methods here. For such
works, see, e.g., [45, 52, 93].
Discriminative training is an automatic learning process based on the training
data. This is also in high demand for real search engines, because every day these
search engines receive a large amount of user feedback and usage logs. It is very
important to learn from this feedback automatically and to constantly improve the
ranking mechanism.
Due to these two characteristics, learning to rank has been widely used in
commercial search engines (see footnote 18), and has also attracted great
attention from the academic research community.
1.3.3 Learning-to-Rank Framework
Figure 1.6 shows the typical “learning-to-rank” flow. Since learning to rank is
a kind of supervised learning, a training set is needed. The creation of a
training set is very similar to the creation of the test set for evaluation.
For example, a typical training set consists of $n$ training queries $q_i$ ($i = 1, \ldots, n$),
their associated documents represented by feature vectors $x^{(i)} = \{x_j^{(i)}\}_{j=1}^{m^{(i)}}$ (where
18 See http://blog.searchenginewatch.com/050622-082709,
http://blogs.msdn.com/msnsearch/archive/2005/06/21/431288.aspx, and
http://glinden.blogspot.com/2005/06/msn-search-and-learning-to-rank.html.