To incorporate this kind of query difference, a position-sensitive query-dependent loss function is proposed in [2]. In particular, for navigational and transactional queries, the loss function focuses on the single exactly relevant page; for informational queries, it considers all the relevant pages that should be ranked within the top-k positions.
Accordingly, the query-dependent loss can be written as follows:

L(f; q) = α(q) L(f; q, c_I) + (1 − α(q)) L(f; q, c_N),

where c_I denotes informational queries and c_N denotes navigational and transactional queries, and α(q) = P(c_I | q) represents the probability that q is an informational query. The loss L(f; q, c) is defined with a position function Φ(q, c), i.e.,

L(f; q, c) = Σ_j L(f; x_j, y_j) · I{π_y(x_j) ∈ Φ(q, c), π_y ∈ Ω_y},

where Ω_y is the equivalent permutation set of the ground-truth label, and Φ(q, c) is the set of ranking positions on which users expect high result accuracy. Specifically, Φ(q, c_I) = {1, ..., k} and Φ(q, c_N) = {1}.

In order to make the above query-dependent loss function optimizable, a set of techniques is proposed, including learning the query classifier (based on a set of query features) and the ranking model (based on a set of query-document matching features) in a unified manner. Experimental results show that applying the idea of the query-dependent loss to existing algorithms such as RankNet [4] and ListMLE [11] significantly improves ranking performance.
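As a rough sketch of how such a loss combines the two query categories, the following Python fragment blends a position-restricted loss by α(q). All names here are hypothetical illustrations (the per-document losses and the classifier output α(q) are assumed to be computed elsewhere), not the authors' implementation:

```python
def position_sensitive_loss(doc_losses, ranked_positions, phi_positions):
    """Sum per-document losses, counting only documents that the
    ranking places inside the position set Phi(q, c)."""
    return sum(loss for loss, pos in zip(doc_losses, ranked_positions)
               if pos in phi_positions)

def query_dependent_loss(p_informational, doc_losses, ranked_positions, k):
    """Blend the two category losses with alpha(q) = P(c_I | q).

    Informational queries care about the top-k positions, Phi(q, c_I)
    = {1, ..., k}; navigational/transactional queries care only about
    position 1, Phi(q, c_N) = {1}.
    """
    loss_inf = position_sensitive_loss(doc_losses, ranked_positions,
                                       set(range(1, k + 1)))
    loss_nav = position_sensitive_loss(doc_losses, ranked_positions, {1})
    return p_informational * loss_inf + (1 - p_informational) * loss_nav
```

In the unified training scheme described above, α(q) would itself be the output of the learned query classifier, so the gradient flows through both the classifier and the ranking model.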
7.2 Query-Dependent Ranking Function
In addition to the use of a query-dependent loss function, researchers have also
investigated the use of different ranking functions for different (kinds of) queries.
7.2.1 Query Classification-Based Approach
In [9], Kang and Kim classify queries into categories based on search intentions and build different ranking models accordingly.
In particular, user queries are classified into two categories, the topic relevance
task and the homepage finding task. In order to perform effective query classifica-
tion, the following information is utilized:
Distribution difference: Two datasets are constructed, one for topic relevance and the other for homepage finding, and a language model is built from each. For a given query, the difference between the probabilities assigned by the two models is used as a feature for query classification.
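A minimal sketch of this distribution-difference feature, assuming simple add-one-smoothed unigram language models (the helper names and the smoothing choice are illustrative assumptions, not details from [9]):

```python
import math
from collections import Counter

def unigram_lm(corpus_tokens, vocab):
    """Unigram language model with add-one (Laplace) smoothing."""
    counts = Counter(corpus_tokens)
    total, v = len(corpus_tokens), len(vocab)
    return lambda w: (counts[w] + 1) / (total + v)

def distribution_difference(query_tokens, lm_topic, lm_homepage):
    """Difference of the query's log-likelihoods under the two models;
    positive values suggest a topic-relevance query, negative values a
    homepage-finding query."""
    lp_topic = sum(math.log(lm_topic(w)) for w in query_tokens)
    lp_home = sum(math.log(lm_homepage(w)) for w in query_tokens)
    return lp_topic - lp_home
```

The sign and magnitude of this single number then serve as one input feature to the query classifier, alongside the other sources of evidence used in [9].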