two-layer ranking, respectively. We will introduce these three settings in detail in the next chapter.
15.3.2 Generalization Analysis for Ranking
Given the existence of queries and documents in learning to rank for information retrieval, when talking about the generalization ability of an algorithm, one needs to pay attention to both the number of training queries and the number of training documents. For example, a natural question to ask is the following: what are the minimum numbers of training queries and training documents required to guarantee a target difference between the training error and the test error? If the total number of documents to label is constrained, shall we label more queries (shallow labeling) or more documents per query (deep labeling)?
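To make the shallow-versus-deep question concrete, the following is a minimal simulation sketch. Everything in it is an illustrative assumption rather than a method from the literature discussed here: the synthetic data model, the linear scoring function, and the RankNet-style pairwise training loop are all hypothetical. It compares different allocations of a fixed labeling budget and reports the resulting gap between training and test pairwise error.

import numpy as np

rng = np.random.default_rng(0)

def make_query(n_docs, dim=5, noise=0.5):
    # Hypothetical data model: relevance is a noisy linear function of features.
    X = rng.normal(size=(n_docs, dim))
    y = X @ np.ones(dim) + noise * rng.normal(size=n_docs)
    return X, y

def pairwise_error(w, queries):
    # Average fraction of mis-ordered document pairs over the given queries.
    errs = []
    for X, y in queries:
        s = X @ w
        bad = tot = 0
        for i in range(len(y)):
            for j in range(i + 1, len(y)):
                tot += 1
                bad += (s[i] - s[j]) * (y[i] - y[j]) < 0
        errs.append(bad / tot)
    return float(np.mean(errs))

def train_pairwise(queries, dim=5, lr=0.1, epochs=30):
    # RankNet-style logistic pairwise training of a linear scoring function.
    w = np.zeros(dim)
    for _ in range(epochs):
        for X, y in queries:
            for i in range(len(y)):
                for j in range(len(y)):
                    if y[i] > y[j]:
                        d = X[i] - X[j]
                        p = 1.0 / (1.0 + np.exp(-(w @ d)))  # P(i ranked above j)
                        w += lr * (1.0 - p) * d / len(y) ** 2
    return w

test_set = [make_query(20) for _ in range(100)]
# Fixed budget of 200 labeled documents, split between queries and depth.
for n_queries, docs_per_query in [(40, 5), (20, 10), (10, 20)]:
    train_set = [make_query(docs_per_query) for _ in range(n_queries)]
    w = train_pairwise(train_set)
    gap = pairwise_error(w, test_set) - pairwise_error(w, train_set)
    print(f"{n_queries:2d} queries x {docs_per_query:2d} docs/query: "
          f"train/test gap = {gap:.3f}")

Such a simulation only illustrates the question; it does not answer it in general, which is exactly why the theoretical analyses discussed below are needed.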
In the literature of learning to rank, some empirical studies have been conducted to answer the above questions; however, most existing work on generalization analysis for ranking cannot answer them yet. As one will see in the next chapter, these works derive either generalization bounds with respect to the number of documents (or document pairs) [1-3, 5, 7, 11] or bounds with respect to the number of queries [8, 10]. Only a recent work [4] gives an answer to the trade-off between the number of queries and the number of documents.
15.3.3 Statistical Consistency for Ranking
As mentioned before, statistical consistency is defined with respect to a true loss for ranking. Since the ranking task is more complex than classification, its true loss should also take more factors into consideration. Partly because of this, there is no consensus yet on the true loss for ranking. Instead, different true losses are assumed in different studies, including the pointwise 0-1 loss, the pairwise 0-1 loss [5, 7], the permutation-level 0-1 loss [13], the top-k loss [12], and measure-based ranking errors (e.g., (1 - NDCG) and (1 - MAP)) [6]. The difficulty of theoretical analysis varies considerably across these true losses. Analyses with respect to the pointwise or pairwise 0-1 loss are relatively easy because of the analogy to classification. Analyses with respect to measure-based ranking errors are the most difficult, because such true losses are defined at the query level and are position based, and thus their mathematical properties are not well behaved.
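For concreteness, here is a small sketch of how these true losses can be computed for a single query. It assumes NumPy arrays of scores and graded relevance labels, and the formalizations (in particular of the top-k loss in [12]) are simplified illustrations rather than the exact definitions used in the cited papers.

import numpy as np

def pairwise_01_loss(scores, labels):
    # Fraction of document pairs with distinct labels ranked in the wrong order.
    bad = tot = 0
    for i in range(len(labels)):
        for j in range(i + 1, len(labels)):
            if labels[i] != labels[j]:
                tot += 1
                bad += (scores[i] - scores[j]) * (labels[i] - labels[j]) < 0
    return bad / tot if tot else 0.0

def permutation_01_loss(scores, labels):
    # 1 if the predicted permutation differs from the ideal one, else 0
    # (ties among labels are broken arbitrarily by argsort).
    return float(not np.array_equal(np.argsort(-scores), np.argsort(-labels)))

def top_k_loss(scores, labels, k):
    # Simplified version: 1 unless the predicted top-k set matches the ideal one.
    return float(set(np.argsort(-scores)[:k]) != set(np.argsort(-labels)[:k]))

def one_minus_ndcg(scores, labels):
    # 1 - NDCG with exponential gain and logarithmic position discount.
    order = np.argsort(-scores)
    gains = 2.0 ** labels - 1.0
    discounts = 1.0 / np.log2(np.arange(2, len(labels) + 2))
    dcg = np.sum(gains[order] * discounts)
    idcg = np.sum(np.sort(gains)[::-1] * discounts)
    return 1.0 - dcg / idcg if idcg > 0 else 0.0

# Example: four documents with graded relevance labels in {0, 1, 2}.
scores = np.array([2.5, 0.3, 1.7, 0.2])
labels = np.array([2.0, 0.0, 1.0, 1.0])
print(pairwise_01_loss(scores, labels), top_k_loss(scores, labels, k=2),
      one_minus_ndcg(scores, labels))

Note how the query-level, position-based character of one_minus_ndcg is visible even in this sketch: it depends on the entire induced permutation through a discount that changes discontinuously as documents swap positions, which is precisely what makes its mathematical properties hard to work with.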
Given a true loss, in order to perform a meaningful analysis of statistical consistency, one needs to derive a regret bound, which relates the regret measured by the true loss to the regret measured by the surrogate loss that is actually optimized. When the pointwise or pairwise 0-1 loss is used as the true loss for ranking, the task is similar to that for classification.
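Schematically, such a bound takes the following form; the notation here is illustrative, in the style of surrogate regret bounds known from classification, and is not drawn from any specific paper cited in this chapter:

R_{\ell}(f) \,=\, \mathbb{E}\bigl[\ell(f)\bigr] - \inf_{g} \mathbb{E}\bigl[\ell(g)\bigr]
\;\le\; \psi\Bigl(\mathbb{E}\bigl[\phi(f)\bigr] - \inf_{g} \mathbb{E}\bigl[\phi(g)\bigr]\Bigr),

where \ell denotes the true loss, \phi the surrogate loss, and \psi a non-decreasing function with \psi(0) = 0 that is continuous at 0. Consistency then follows because driving the surrogate regret on the right-hand side to zero drives the true regret on the left-hand side to zero.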
However, when a more complex true loss, such as the permutation-level 0-1 loss, the top-k loss, or a measure-based ranking error, is used, the task becomes much more complex. In such cases, one usually needs to make some assumptions on the input