The first term on the right-hand side of the above equation is called the approximation error, which measures how well the best function in G can approach the target. The second term, called the estimation error, measures how close g_n is to the best function in G. In the literature of statistical learning theory, most attention has been paid to the estimation error, which is also the focus of our introduction in this chapter.
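For reference, the decomposition referred to above can be written in its standard form (reconstructed here; R* denotes the risk of the Bayes optimal function and g* = argmin_{g∈G} R(g) the best function in G):

```latex
R(g_n) - R^{*}
  = \underbrace{\bigl(R(g^{*}) - R^{*}\bigr)}_{\text{approximation error}}
  + \underbrace{\bigl(R(g_n) - R(g^{*})\bigr)}_{\text{estimation error}}
```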
The bound regarding the estimation error takes the form

R(g_n) − R(g*) ≤ 2 sup_{g∈G} |R̂(g) − R(g)|.

Considering that g_n minimizes the empirical risk in G, we have R̂(g_n) ≤ R̂(g*), i.e., R̂(g*) − R̂(g_n) ≥ 0. Then we obtain the inequality

R(g_n) = R(g_n) − R̂(g_n) + R̂(g_n)
       ≤ R(g_n) − R̂(g_n) + R̂(g*)
       = [R(g_n) − R̂(g_n)] + [R̂(g*) − R(g*)] + R(g*)
       ≤ 2 sup_{g∈G} |R̂(g) − R(g)| + R(g*).
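The chain of inequalities above can be verified numerically. The following is a minimal sketch; the distribution, the finite class of threshold classifiers, and the 0-1 loss are all hypothetical choices for illustration, not taken from the text:

```python
import random

random.seed(0)

# Hypothetical setup: X ~ Uniform[0, 1], Y = 1{X > 0.3}, and G is a
# finite class of threshold classifiers g_t(x) = 1{x > t} under 0-1 loss.
thresholds = [i / 20 for i in range(21)]

def true_risk(t):
    # R(g_t) = P(g_t(X) != Y); exactly the points between t and 0.3
    # are misclassified, so the risk is |t - 0.3|.
    return abs(t - 0.3)

n = 200
data = [(x, 1 if x > 0.3 else 0) for x in (random.random() for _ in range(n))]

def emp_risk(t):
    # Empirical risk R_hat(g_t): average 0-1 loss on the sample.
    return sum(1 for x, y in data if (1 if x > t else 0) != y) / n

g_star = min(thresholds, key=true_risk)   # best function in G
g_n = min(thresholds, key=emp_risk)       # empirical risk minimizer
sup_dev = max(abs(emp_risk(t) - true_risk(t)) for t in thresholds)

# The derived bound: R(g_n) <= 2 * sup_g |R_hat(g) - R(g)| + R(g*),
# i.e., the estimation error is at most twice the supremum deviation.
assert true_risk(g_n) <= 2 * sup_dev + true_risk(g_star) + 1e-12
```

The final assertion holds for any sample, since the derivation above is distribution-free; only the tightness of the bound depends on the data.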
Based on the above inequality, it is clear that if one can obtain a bound for sup_{g∈G} |R̂(g) − R(g)|, one will correspondingly obtain a bound for the estimation error. This is what many previous works have done.
22.3.2 Bounds for sup_{g∈G} |R̂(g) − R(g)|
In order to ease the analysis on sup_{g∈G} |R̂(g) − R(g)|, here we define a new function class, called the loss class:

F = { f : (x, y) ↦ ℓ(g(x), y), g ∈ G },

where ℓ denotes the loss function. Then it is clear that

R(g) = E[f(Z)],   R̂(g) = (1/n) Σ_{i=1}^n f(Z_i),

where Z = (X, Y) and Z_i = (X_i, Y_i).
22.3.2.1 Hoeffding's Inequality
One may have noticed that the relationship between R(g) and R̂(g) is actually the relationship between an expectation and an empirical average. According to the law of large numbers, when the number of samples approaches infinity, R̂(g) will converge to R(g). Furthermore, there is a quantitative version of the law of large numbers when the loss values are bounded. This is the famous Hoeffding inequality.
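As a concrete illustration, Hoeffding's inequality for losses taking values in [0, 1] states that P(|R̂(g) − R(g)| ≥ ε) ≤ 2 exp(−2nε²). This can be checked by simulation; the Bernoulli loss distribution below is a hypothetical choice for the sketch:

```python
import math
import random

random.seed(1)

# Hoeffding's inequality for f(Z) in [0, 1]:
#   P(|R_hat - R| >= eps) <= 2 * exp(-2 * n * eps**2)
# Monte Carlo check with f(Z) ~ Bernoulli(0.3), so R = E[f(Z)] = 0.3.
R, n, eps, trials = 0.3, 100, 0.1, 20000

violations = 0
for _ in range(trials):
    r_hat = sum(1 if random.random() < R else 0 for _ in range(n)) / n
    if abs(r_hat - R) >= eps:
        violations += 1

empirical = violations / trials            # observed deviation frequency
bound = 2 * math.exp(-2 * n * eps ** 2)    # Hoeffding bound, 2e^{-2}

assert empirical <= bound
```

With these parameters the observed frequency of large deviations stays well below the bound, consistent with the fact that Hoeffding's inequality is valid for every bounded loss but not tight for any particular distribution.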