Query-Dependent Ranking - Learning to Rank for Information Retrieval


To incorporate this kind of query difference, a position-sensitive query-dependent loss function is proposed in [2]. In particular, for navigational and transactional queries, the loss function focuses on the single exact relevant page, while for informational queries, the loss considers those relevant pages that should be ranked within the top-k positions.

Accordingly, the query-dependent loss can be written as follows.

$$L(f; q) = \alpha(q)\, L(f; q, c_I) + \bigl(1 - \alpha(q)\bigr)\, L(f; q, c_N), \tag{7.5}$$

where $c_I$ denotes informational queries and $c_N$ denotes navigational and transactional queries, $\alpha(q) = P(c_I \mid q)$ represents the probability that $q$ is an informational query; $L(f; q, c)$ is defined with a position function $\Phi(q, c)$, i.e.,

$$L(f; q, c) = \sum_{j=1}^{m} L(f; x_j, y_j)\, I\{\pi_y(x_j) \in \Phi(q, c),\ \exists\, \pi_y \in \Omega_y\}, \tag{7.6}$$

where $\Omega_y$ is the equivalent permutation set of the ground truth label, and $\Phi(q, c)$ is a set of ranking positions on which users expect high result accuracy. Specifically, $\Phi(q, c_I) = \{1, \ldots, k\}$ and $\Phi(q, c_N) = \{1\}$.

In order to make the above query-dependent loss functions optimizable, a set of techniques is proposed, including learning the query classifier (based on a set of query features) and the ranking model (based on a set of query-document matching features) in a unified manner. The experimental results show that applying the idea of query-dependent loss to existing algorithms like RankNet [4] and ListMLE [11] significantly improves ranking performance.
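To make Eqs. (7.5) and (7.6) concrete, the following is a minimal sketch of how the mixture loss might be evaluated for a single query. Several pieces are assumptions made only for illustration: the per-document loss is taken to be squared error, the set of equivalent permutations is taken to be every ordering that sorts documents by label with ties broken arbitrarily, and the function names (`squared_error`, `achievable_positions`, `class_loss`, `query_dependent_loss`) are hypothetical rather than taken from [2].

```python
from typing import Callable, Sequence, Set


def squared_error(score: float, label: float) -> float:
    """Illustrative per-document loss L(f; x_j, y_j); an assumption, not from the source."""
    return (score - label) ** 2


def achievable_positions(labels: Sequence[float], j: int) -> range:
    """Positions document j can take in some label-consistent permutation.

    Documents with a strictly higher label must come first; tied documents
    may appear in any order among themselves.
    """
    higher = sum(1 for y in labels if y > labels[j])
    tied = sum(1 for y in labels if y == labels[j])  # includes document j itself
    return range(higher + 1, higher + tied + 1)


def class_loss(
    scores: Sequence[float],
    labels: Sequence[float],
    positions: Set[int],
    doc_loss: Callable[[float, float], float] = squared_error,
) -> float:
    """Eq. (7.6): sum the loss over documents that can reach a position in Phi(q, c)."""
    total = 0.0
    for j, (s, y) in enumerate(zip(scores, labels)):
        if any(p in positions for p in achievable_positions(labels, j)):
            total += doc_loss(s, y)
    return total


def query_dependent_loss(
    scores: Sequence[float],
    labels: Sequence[float],
    alpha: float,
    k: int,
    doc_loss: Callable[[float, float], float] = squared_error,
) -> float:
    """Eq. (7.5): mix the informational and navigational losses by alpha = P(c_I | q)."""
    informational = class_loss(scores, labels, set(range(1, k + 1)), doc_loss)  # Phi = {1..k}
    navigational = class_loss(scores, labels, {1}, doc_loss)                    # Phi = {1}
    return alpha * informational + (1.0 - alpha) * navigational
```

With `alpha` close to 1 the loss penalizes errors on every document that belongs in the top k, while with `alpha` close to 0 only the documents that can legitimately occupy position 1 contribute.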

7.2 Query-Dependent Ranking Function

In addition to the use of a query-dependent loss function, researchers have also

investigated the use of different ranking functions for different (kinds of) queries.

7.2.1 Query Classification-Based Approach

In [9], Kang and Kim classify queries into categories based on search intentions and build different ranking models accordingly.
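The classify-then-rank idea can be pictured as a dispatch table from query category to ranking model. The sketch below is only an illustration under stated assumptions: the function `rank_with_query_class`, the `Ranker` type, and the convention that a ranker returns one score per document are all hypothetical, and the classifier and per-category models are supplied by the caller rather than taken from [9].

```python
from typing import Callable, Dict, List, Sequence

# A ranker maps a list of documents to one score per document (hypothetical convention).
Ranker = Callable[[Sequence[dict]], List[float]]


def rank_with_query_class(
    query: str,
    docs: Sequence[dict],
    classify: Callable[[str], str],
    rankers: Dict[str, Ranker],
) -> List[dict]:
    """Pick the ranking model that matches the predicted query category, then sort."""
    category = classify(query)           # predicted category label for this query
    scores = rankers[category](docs)     # scores from the category-specific model
    order = sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)
    return [docs[i] for i in order]
```

Keeping the classifier and the rankers as separate arguments makes it easy to swap either one without touching the routing logic.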

In particular, user queries are classified into two categories: the topic relevance task and the homepage finding task. In order to perform effective query classification, the following information is utilized:

• Distribution difference: Two datasets are constructed, one for topic relevance and the other for homepage finding. Then two language models are built from them. For a query, the difference in the probabilities given by the two models is used as a feature for query classification.
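The distribution-difference feature can be sketched as follows. This is a minimal illustration, not the construction used in [9]: it assumes unigram language models with add-one smoothing over a shared vocabulary and whitespace tokenization, and the function names (`build_unigram_model`, `query_probability`, `distribution_difference`) are hypothetical.

```python
from collections import Counter
from typing import Iterable, Tuple

Model = Tuple[Counter, int]  # (term counts, total number of tokens)


def build_unigram_model(texts: Iterable[str]) -> Model:
    """Count lowercase whitespace-separated terms over a collection of texts."""
    counts = Counter(w for t in texts for w in t.lower().split())
    return counts, sum(counts.values())


def query_probability(query: str, model: Model, vocab_size: int) -> float:
    """Probability of the query under a unigram model with add-one smoothing."""
    counts, total = model
    prob = 1.0
    for w in query.lower().split():
        prob *= (counts[w] + 1) / (total + vocab_size)
    return prob


def distribution_difference(query: str, topic: Model, homepage: Model) -> float:
    """Feature value: P_topic(query) - P_homepage(query)."""
    vocab_size = len(set(topic[0]) | set(homepage[0]))  # shared vocabulary
    return (query_probability(query, topic, vocab_size)
            - query_probability(query, homepage, vocab_size))
```

A positive value suggests the query looks more like the topic relevance data and a negative value more like the homepage finding data; in practice the value would be one input feature to the query classifier rather than a decision by itself.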
