[Fig. 5.1 Modeling ranking as a sequence of classifications]
either. Overall, the scoring function makes one error in this sequence of classification tasks.
By further assigning a non-negative weight $\beta(t)$ $(t = 1, \ldots, m-1)$ to the classification task at the $t$th step, which represents its importance to the entire sequence, we obtain the weighted sum of the classification errors of all individual tasks:

$$L_\beta(f; \mathbf{x}, \pi_y) = \sum_{t=1}^{m-1} \beta(t) \biggl( 1 - \prod_{i=t+1}^{m} I_{\{ f(x_{\pi_y^{-1}(t)}) > f(x_{\pi_y^{-1}(i)}) \}} \biggr). \tag{5.7}$$
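As an illustration, here is a minimal Python sketch of Eq. (5.7); the function name weighted_seq_loss and the representation of $\pi_y^{-1}$ as a 0-based list are our own illustrative choices, not notation from the text.

```python
def weighted_seq_loss(scores, perm, beta):
    """Eq. (5.7): weighted sum of classification errors over the sequence.

    scores -- scores[j] is the model score f(x_j) of document j
    perm   -- perm[t] is the index of the document that pi_y places at
              rank t (0-based), i.e. the inverse permutation pi_y^{-1}
    beta   -- beta[t] is the non-negative weight of the task at step t
    """
    m = len(perm)
    loss = 0.0
    for t in range(m - 1):
        # Step t is classified correctly only if the document ranked at
        # position t outscores every document ranked below it.
        correct = all(scores[perm[t]] > scores[perm[i]]
                      for i in range(t + 1, m))
        loss += beta[t] * (0.0 if correct else 1.0)
    return loss
```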
Then we define the minimum value of the weighted sum over all the permutations in $\Omega_y$ as the essential loss for ranking:

$$L_\beta(f; \mathbf{x}, \Omega_y) = \min_{\pi_y \in \Omega_y} L_\beta(f; \mathbf{x}, \pi_y). \tag{5.8}$$
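Continuing the sketch, for small $m$ the essential loss of Eq. (5.8) can be evaluated by brute force: enumerate every permutation that orders the documents by non-increasing label (i.e., every member of $\Omega_y$) and take the minimum of Eq. (5.7). The enumeration is exponential in $m$ and is meant only to illustrate the definition.

```python
from itertools import permutations

def essential_loss(scores, labels, beta):
    """Eq. (5.8): minimum of Eq. (5.7) over Omega_y, the set of all
    permutations that sort the documents by non-increasing label."""
    best = float("inf")
    for perm in permutations(range(len(labels))):
        consistent = all(labels[perm[t]] >= labels[perm[t + 1]]
                         for t in range(len(perm) - 1))
        if consistent:
            best = min(best, weighted_seq_loss(scores, list(perm), beta))
    return best
```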
It has been proven in [3] that the essential loss is an upper bound of both (1 − NDCG) and (1 − MAP), as shown in the following theorem. As a result, minimizing the essential loss effectively maximizes NDCG and MAP.
Theorem 5.1 Suppose there are $m_k$ documents with rating $k$, and $\sum_{i \ge k^*} m_i > 0$ (when computing MAP, labels smaller than $k^*$ are regarded as irrelevant). Then $\forall f$, the following inequalities hold:

$$(1)\quad 1 - \mathrm{NDCG}(\pi, \mathbf{y}) \le \frac{1}{Z_m} L_{\beta_1}(f; \mathbf{x}, \Omega_y), \quad \text{where } \beta_1(t) = G\bigl(l_{\pi_y^{-1}(t)}\bigr)\,\eta(t), \ \forall \pi_y \in \Omega_y;$$

$$(2)\quad 1 - \mathrm{MAP}(\pi, \mathbf{y}) \le \frac{1}{\sum_{i \ge k^*} m_i} L_{\beta_2}(f; \mathbf{x}, \Omega_y), \quad \text{where } \beta_2(t) \equiv 1.$$
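As a toy sanity check of inequality (2), the snippet below (reusing the hypothetical helpers above) compares $1 - \mathrm{AP}$ of the score-induced ranking against the bound on a three-document query with binary labels; the average_precision helper and the numbers are illustrative, not from the text.

```python
def average_precision(scores, labels, k_star=1):
    """AP of the ranking induced by the scores; labels below k_star are
    treated as irrelevant (assumes at least one relevant document)."""
    order = sorted(range(len(scores)), key=lambda j: -scores[j])
    rel = [1 if labels[j] >= k_star else 0 for j in order]
    hits, ap = 0, 0.0
    for pos, r in enumerate(rel, start=1):
        if r:
            hits += 1
            ap += hits / pos
    return ap / sum(rel)

scores = [0.5, 0.9, 0.2]              # f(x_1), f(x_2), f(x_3)
labels = [1, 0, 1]                    # binary relevance, k* = 1
beta2 = [1.0] * (len(labels) - 1)     # beta_2(t) = 1 for all t
n_rel = sum(1 for l in labels if l >= 1)
lhs = 1 - average_precision(scores, labels)          # 1 - MAP (one query)
rhs = essential_loss(scores, labels, beta2) / n_rel  # bound in (2)
print(lhs, rhs, lhs <= rhs)           # 0.4166..., 1.0, True
```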
As compared to the bounds for the pointwise approach given in the previous section, one can see that the essential loss has a nicer property: when (1 − NDCG) or (1 − MAP) is zero, the essential loss is also zero. In other words, a zero essential loss is not only a sufficient but also a necessary condition for zero values of (1 − NDCG) and (1 − MAP).

Furthermore, it has been proven in [3] that the essential loss is a lower bound of many pairwise loss functions.