two-layer ranking, respectively. We will introduce these three settings in detail in the next chapter.
15.3.2 Generalization Analysis for Ranking
Given the existence of queries and documents in learning to rank for information retrieval, when talking about the generalization ability of an algorithm, one needs to pay attention to both the number of training queries and the number of training documents. For example, a natural question to ask is the following: what are the minimum numbers of training queries and training documents required to guarantee a target difference between the training error and the test error? If the total number of documents to label is constrained, shall we label more queries (shallow labeling) or more documents per query (deep labeling)?
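To make the shallow-versus-deep question concrete, the following is a minimal simulation sketch. Everything in it is an illustrative assumption rather than a method from the literature discussed here: the synthetic data model, the linear scoring function, and the RankNet-style pairwise training loop are all hypothetical. It compares different allocations of a fixed labeling budget and reports the resulting gap between training and test pairwise error.

import numpy as np

rng = np.random.default_rng(0)

def make_query(n_docs, dim=5, noise=0.5):
    # Hypothetical data model: relevance is a noisy linear function of features.
    X = rng.normal(size=(n_docs, dim))
    y = X @ np.ones(dim) + noise * rng.normal(size=n_docs)
    return X, y

def pairwise_error(w, queries):
    # Average fraction of mis-ordered document pairs over the given queries.
    errs = []
    for X, y in queries:
        s = X @ w
        bad = tot = 0
        for i in range(len(y)):
            for j in range(i + 1, len(y)):
                tot += 1
                bad += (s[i] - s[j]) * (y[i] - y[j]) < 0
        errs.append(bad / tot)
    return float(np.mean(errs))

def train_pairwise(queries, dim=5, lr=0.1, epochs=30):
    # RankNet-style logistic pairwise training of a linear scoring function.
    w = np.zeros(dim)
    for _ in range(epochs):
        for X, y in queries:
            for i in range(len(y)):
                for j in range(len(y)):
                    if y[i] > y[j]:
                        d = X[i] - X[j]
                        p = 1.0 / (1.0 + np.exp(-(w @ d)))  # P(i ranked above j)
                        w += lr * (1.0 - p) * d / len(y) ** 2
    return w

test_set = [make_query(20) for _ in range(100)]
# Fixed budget of 200 labeled documents, split between queries and depth.
for n_queries, docs_per_query in [(40, 5), (20, 10), (10, 20)]:
    train_set = [make_query(docs_per_query) for _ in range(n_queries)]
    w = train_pairwise(train_set)
    gap = pairwise_error(w, test_set) - pairwise_error(w, train_set)
    print(f"{n_queries:2d} queries x {docs_per_query:2d} docs/query: "
          f"train/test gap = {gap:.3f}")

Such a simulation only illustrates the question; it does not answer it in general, which is exactly why the theoretical analyses discussed below are needed.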
In the literature of learning to rank, some empirical studies have been conducted to answer the above questions; however, most existing work on generalization analysis for ranking cannot answer them yet. As one will see in the next chapter, these works derive either generalization bounds with respect to the number of documents (or document pairs) [1-3, 5, 7, 11] or bounds with respect to the number of queries [8, 10]. Only a recent work [4] gives an answer to the trade-off between the number of queries and the number of documents.
15.3.3 Statistical Consistency for Ranking
As mentioned before, statistical consistency is defined with respect to a true loss for ranking. Since the ranking task is more complex than classification, its true loss should also take more factors into consideration. Partly because of this, there is no consensus yet on the true loss for ranking. Instead, different true losses are assumed in different studies, including the pointwise 0-1 loss, the pairwise 0-1 loss [5, 7], the permutation-level 0-1 loss [13], the top-k loss [12], and measure-based ranking errors (e.g., (1 - NDCG) and (1 - MAP)) [6]. The difficulty of theoretical analysis varies considerably across these true losses. Analyses with respect to the pointwise or pairwise 0-1 loss are relatively easy because of the analogy to classification. Analyses with respect to measure-based ranking errors are the most difficult, because such true losses are defined at the query level and are position based, and thus their mathematical properties are not well behaved.
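For concreteness, here is a small sketch of how these true losses can be computed for a single query. It assumes NumPy arrays of scores and graded relevance labels, and the formalizations (in particular of the top-k loss in [12]) are simplified illustrations rather than the exact definitions used in the cited papers.

import numpy as np

def pairwise_01_loss(scores, labels):
    # Fraction of document pairs with distinct labels ranked in the wrong order.
    bad = tot = 0
    for i in range(len(labels)):
        for j in range(i + 1, len(labels)):
            if labels[i] != labels[j]:
                tot += 1
                bad += (scores[i] - scores[j]) * (labels[i] - labels[j]) < 0
    return bad / tot if tot else 0.0

def permutation_01_loss(scores, labels):
    # 1 if the predicted permutation differs from the ideal one, else 0
    # (ties among labels are broken arbitrarily by argsort).
    return float(not np.array_equal(np.argsort(-scores), np.argsort(-labels)))

def top_k_loss(scores, labels, k):
    # Simplified version: 1 unless the predicted top-k set matches the ideal one.
    return float(set(np.argsort(-scores)[:k]) != set(np.argsort(-labels)[:k]))

def one_minus_ndcg(scores, labels):
    # 1 - NDCG with exponential gain and logarithmic position discount.
    order = np.argsort(-scores)
    gains = 2.0 ** labels - 1.0
    discounts = 1.0 / np.log2(np.arange(2, len(labels) + 2))
    dcg = np.sum(gains[order] * discounts)
    idcg = np.sum(np.sort(gains)[::-1] * discounts)
    return 1.0 - dcg / idcg if idcg > 0 else 0.0

# Example: four documents with graded relevance labels in {0, 1, 2}.
scores = np.array([2.5, 0.3, 1.7, 0.2])
labels = np.array([2.0, 0.0, 1.0, 1.0])
print(pairwise_01_loss(scores, labels), top_k_loss(scores, labels, k=2),
      one_minus_ndcg(scores, labels))

Note how the query-level, position-based character of one_minus_ndcg is visible even in this sketch: it depends on the entire induced permutation through a discount that changes discontinuously as documents swap positions, which is precisely what makes its mathematical properties hard to work with.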
Given a true loss, in order to perform a meaningful analysis of statistical consistency, one needs to derive a regret bound, which relates the regret measured by the true loss to the regret measured by the surrogate loss that is actually optimized. When the pointwise or pairwise 0-1 loss is used as the true loss for ranking, the task is similar to that for classification.
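Schematically, such a bound takes the following form; the notation here is illustrative, in the style of surrogate regret bounds known from classification, and is not drawn from any specific paper cited in this chapter:

R_{\ell}(f) \,=\, \mathbb{E}\bigl[\ell(f)\bigr] - \inf_{g} \mathbb{E}\bigl[\ell(g)\bigr]
\;\le\; \psi\Bigl(\mathbb{E}\bigl[\phi(f)\bigr] - \inf_{g} \mathbb{E}\bigl[\phi(g)\bigr]\Bigr),

where \ell denotes the true loss, \phi the surrogate loss, and \psi a non-decreasing function with \psi(0) = 0 that is continuous at 0. Consistency then follows because driving the surrogate regret on the right-hand side to zero drives the true regret on the left-hand side to zero.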
However, when a more complex true loss, such as the permutation-level 0-1 loss, the top-k loss, or a measure-based ranking error, is used, the task becomes much more complex. In such cases, one usually needs to make some assumptions on the input