Information Technology Reference
Then the gradient ascent method is used to maximize the log likelihood, and the
learned parameters are then used to rank the documents of a new query. Specifically,
given documents x for the new query,
h( x )
Pr ( y
With the above general idea of C-CRF, in [ 7 ], it is demonstrated how to define
specific forms of feature functions and the conditional probabilities for several re-
lational ranking tasks, such as topic distillation and pseudo relevance feedback. In
these specific cases, it has been shown in [ 7 ] that the optimal ranking functions have
closed-form solutions. This greatly eases both training and test processes.
For example, the optimal ranking function for pseudo relevance feedback takes
the following form:
= α T eI
βR − 1 x α,
F( x )
where e is a K 1 -dimensional all-ones vector, I is an m
m identity matrix, R is a
m diagonal matrix with D i,i = j R i,j .
For another example, the optimal ranking function for topic distillation takes the
similarity matrix, and D is an m
D c )e , (6.12)
where D r and D c are two diagonal matrices with D ri,i = j R i,j and D ci,i =
α T e 2 x α
h( x ) =
β(D r −
j R j,i .
Experimental results on the LETOR benchmark datasets show that C-CRF can
significantly outperform conventional learning-to-rank methods on the tasks of topic
distillation and pseudo relevance feedback.
6.2 Learning Diverse Ranking
Ideally, the relational ranking framework introduced in the previous subsection can
handle various relational ranking task by using the appropriate relationship to define
its objective function. However, no concrete examples have been given in [ 6 ] and
[ 7 ] on how the framework can be used to learn diverse ranking. On the other hand,
there are several independent works that have investigated the problem of search
result diversification. For example, in [ 9 ], Yue et al. investigate how to use structured
SVM to predict diverse subsets. In [ 8 ], two online learning algorithms, the ranked
explore and commit algorithm and the ranked bandits algorithm, are proposed to
directly learn diverse ranking of documents based on users' click behaviors. In [ 1 ],
a formal probabilistic formulation about search result diversification is proposed,
in the context that the topical categories of both the query and the documents are
given. In [ 4 ], axioms that any diversification system ought to satisfy are discussed.
We will introduce [ 1 ] and [ 4 ] in more detail in this subsection.