In order to select the appropriate model, one needs to investigate the complexity of the corresponding function class. If the function class is too complex, then the model selected from this class will likely overfit, even if trained with a large number of instances. A well-defined complexity measure usually determines a uniform generalization bound for the learning problem. That is, no matter which algorithm one uses, the bound always holds for the generalization ability of the algorithm.
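As an illustration, the classical Rademacher-complexity bound for a loss bounded in [0, 1] has such a uniform form (the notation here is generic, not a result specific to ranking): with probability at least 1 - δ over an i.i.d. sample of size m,

```latex
R(f) \;\le\; \widehat{R}_{m}(f) \;+\; 2\,\mathfrak{R}_{m}(\mathcal{F})
\;+\; \sqrt{\frac{\ln(1/\delta)}{2m}}
\qquad \text{for all } f \in \mathcal{F},
```

where R(f) is the expected risk, R̂_m(f) the empirical risk, and 𝔑_m(𝓕) the Rademacher complexity of the function class 𝓕. The more complex the class, the looser the bound; since it holds simultaneously for every f in 𝓕, it applies to the output of any learning algorithm that selects its model from this class.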
In order to select the appropriate surrogate loss function, one needs to study whether the optimal model learned by minimizing the empirical surrogate risk is the same as the ranking model with the minimum expected true risk. This property is called statistical consistency. To perform consistency analysis, one usually needs to derive a regret bound to bridge the surrogate loss and the true loss, and then once again apply the results of generalization analysis to bridge the empirical risk and the expected risk.
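This two-step argument can be sketched as follows (generic notation, in the style of classification-calibration analysis; the symbols are illustrative, not taken from a specific theorem in this chapter). A regret bound has the form

```latex
R(f) - R^{*} \;\le\; \psi\bigl(R_{\phi}(f) - R_{\phi}^{*}\bigr),
```

where R is the expected true risk, R_φ the expected surrogate risk, R* and R_φ* their minimal values, and ψ is a nondecreasing function, continuous at 0, with ψ(0) = 0. If generalization analysis further guarantees that R_φ(f̂_m) → R_φ* as the sample size m grows, then R(f̂_m) → R*; that is, minimizing the empirical surrogate risk is consistent with minimizing the expected true risk.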
From the above discussions, one can see that the statistical framework, generalization analysis, and statistical consistency are the key issues to be addressed by a learning theory. In the context of ranking, the corresponding issues become the statistical ranking framework, generalization analysis for ranking, and statistical consistency of ranking methods.
Note that the above issues are critical for all machine learning problems. However, owing to the differences among machine learning problems, the corresponding conclusions and proof techniques regarding these issues may be significantly different. Precisely because of this, one cannot simply extend the existing results for classification to ranking; new theories and proof techniques need to be developed. In the following section, we will briefly discuss these issues and show recent advances in statistical learning theory for ranking.
15.3 Learning Theory for Ranking
In this section, we will introduce the key issues regarding statistical learning theory
for ranking.
15.3.1 Statistical Ranking Framework
In the literature of learning to rank, several different statistical frameworks have
been used. For example, when conducting theoretical analysis on ranking methods, some researchers assume that all the documents are i.i.d., no matter whether they are associated with the same query or not [1-3, 5, 7, 11]. Some researchers regard documents as deterministically given, and treat only the queries as i.i.d. random variables [6, 8]. Still other researchers assume that the queries are i.i.d. in the query space, and that the documents associated with the same query are conditionally i.i.d. (with a distribution that depends on the query) [4, 10]. These different assumptions result in different settings of learning to rank, which we call document ranking, subset ranking, and two-layer ranking, respectively.
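The three sampling assumptions above can be sketched as toy data generators (a minimal illustration; the function names, feature representations, and Gaussian distributions are assumptions made for this sketch, not taken from the cited works):

```python
import random

random.seed(0)

def document_ranking_sample(n):
    # Setting 1 (document ranking): all (document feature, relevance)
    # pairs are drawn i.i.d., regardless of which query they belong to.
    return [(random.gauss(0, 1), random.randint(0, 1)) for _ in range(n)]

def subset_ranking_sample(n_queries, docs_per_query):
    # Setting 2 (subset ranking): queries are i.i.d. random variables;
    # the documents for each query are deterministically given
    # (here, the same fixed list for every query).
    fixed_docs = [i / docs_per_query for i in range(docs_per_query)]
    return [(random.gauss(0, 1), list(fixed_docs)) for _ in range(n_queries)]

def two_layer_sample(n_queries, docs_per_query):
    # Setting 3 (two-layer ranking): queries are i.i.d.; documents are
    # conditionally i.i.d. given the query, with a query-dependent
    # distribution (here, centered on the query's feature value).
    samples = []
    for _ in range(n_queries):
        q = random.gauss(0, 1)
        docs = [random.gauss(q, 0.1) for _ in range(docs_per_query)]
        samples.append((q, docs))
    return samples
```

The structural difference is what carries over to the theory: in the first setting generalization bounds are stated per document, while in the latter two they are stated per query, with the document layer handled either deterministically or conditionally.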