assumed to be a random variable with probabilistic distribution $D_q$, and $R(f)$ can
be defined as follows:

\[
R(f) = \int_{\mathcal{Q}} \int_{\mathcal{X}^2 \times \mathcal{Y}} L(f; x_u, x_v, y_{u,v}) \, D_q(dx_u, dx_v, dy_{u,v}) \, P_Q(dq). \tag{16.19}
\]
As both the distributions $P_Q$ and $D_q$ are unknown, the following empirical risk
is used to estimate $R(f)$:

\[
\hat{R}(f) = \frac{1}{n} \sum_{i=1}^{n} \frac{1}{m^{(i)}} \sum_{j=1}^{m^{(i)}} L\bigl(f; x^{(i)}_{j_1}, x^{(i)}_{j_2}, y^{(i)}_{j_1, j_2}\bigr). \tag{16.20}
\]
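
The nested average in (16.20) can be illustrated with a short Python sketch. It is only a minimal example under stated assumptions: the text does not fix the pairwise loss L, so a hinge loss on the score difference is assumed here, and the names `hinge_pair_loss` and `two_layer_pairwise_risk` are hypothetical. Each query contributes the mean loss over its own sampled document pairs, and these per-query means are then averaged over the n queries.

```python
from typing import Callable, List, Sequence, Tuple

# One sampled pair for a query: (x_j1, x_j2, y_j1_j2).
# Assumption: y = +1 if the first document should rank above the second, -1 otherwise.
Pair = Tuple[Sequence[float], Sequence[float], int]
Scorer = Callable[[Sequence[float]], float]


def hinge_pair_loss(f: Scorer, x_u: Sequence[float], x_v: Sequence[float], y_uv: int) -> float:
    """Assumed pairwise loss L(f; x_u, x_v, y_uv): hinge loss on the score difference."""
    return max(0.0, 1.0 - y_uv * (f(x_u) - f(x_v)))


def two_layer_pairwise_risk(
    f: Scorer,
    queries: List[List[Pair]],
    loss: Callable[[Scorer, Sequence[float], Sequence[float], int], float] = hinge_pair_loss,
) -> float:
    """Empirical risk in the form of (16.20): mean over queries of the mean pair loss."""
    if not queries:
        raise ValueError("at least one query is required")
    per_query_means = []
    for pairs in queries:
        if not pairs:
            raise ValueError("every query needs at least one sampled pair")
        # Inner average: (1 / m_i) * sum over the m_i sampled pairs of query i.
        total = sum(loss(f, x_u, x_v, y) for x_u, x_v, y in pairs)
        per_query_means.append(total / len(pairs))
    # Outer average: (1 / n) * sum over the n queries.
    return sum(per_query_means) / len(per_query_means)


# Toy data: one feature per document, scored by that feature.
score = lambda x: x[0]
toy_queries = [
    [([2.0], [0.0], 1), ([0.5], [0.0], 1)],  # query 1: pair losses 0.0 and 0.5
    [([0.0], [1.0], 1)],                     # query 2: pair loss 2.0
]
print(two_layer_pairwise_risk(score, toy_queries))  # 1.125
```

In this toy case the two-layer estimate is (0.25 + 2.0) / 2 = 1.125, whereas pooling all three pairs into one average would give 2.5 / 3, about 0.83. The difference shows that every query carries equal weight regardless of how many pairs were sampled for it.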
16.3.3 The Listwise Approach
Note that most existing listwise ranking algorithms assume that the listwise loss
function takes all the documents associated with a query as input, and there is no
sampling of these documents. Therefore, the two-layer ranking framework does not
explain the existing listwise ranking methods in a straightforward manner. The
algorithms would need to be modified in order to fit into the framework. For
simplicity, we will not discuss the combination of the two-layer ranking
framework and the listwise approach here.
16.4 Summary
In this chapter, we have introduced three major statistical ranking frameworks used
in the literature. The document ranking framework assumes that documents are
i.i.d., regardless of the queries they belong to. The subset ranking framework
ignores the sampling of documents per query and directly assumes that queries are
i.i.d. The two-layer ranking framework considers the i.i.d. sampling of both queries
and documents per query. The two-layer ranking framework clearly describes real
ranking problems in a more natural way. However, the other two frameworks can also
be used to obtain certain theoretical results that explain the behaviors of existing
learning-to-rank methods. For the three frameworks, we have given the definitions
of the empirical and expected risks for the different approaches to learning to
rank. These definitions will be used extensively in the following two chapters,
which are concerned with the generalization ability and statistical consistency of
ranking methods.
16.5 Exercises
16.1 Compare the different probabilistic assumptions of the three ranking frameworks.