$E[Z \mid X]$ (see e.g., [136]). This is the usual regression solution of $Z$ predicted by $X$. One of the reasons why the MMSE estimate $Y$ is so praised in regression problems is that it is the optimal one (it affords the minimum $E[L(Z - Y)]$ for a class of convex, symmetric, and unimodal loss functions) when $g(X)$ is linear and $X$ and $\xi$ are Gaussian [208, 88]. Furthermore, when the noise is independent of $X$ and has zero mean, the conditional expectation factors out as $Y = E[g(X) \mid X] + E[\xi(X) \mid X] = g(X)$. One is then able to retrieve $g(X)$ from $Z$.
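The factorization above can be checked numerically. The following is a minimal sketch, not taken from the text: the linear function `g`, the uniform distribution of $X$, the noise standard deviation, and the binning estimator of the conditional mean are all assumptions chosen for illustration.

```python
import random


def g(x: float) -> float:
    """Hypothetical linear regression function, chosen only for illustration."""
    return 2.0 * x + 1.0


def binned_conditional_mean(n_samples: int = 200_000, n_bins: int = 10, seed: int = 0):
    """Estimate E[Z | X in bin] for Z = g(X) + xi, with zero-mean noise independent of X."""
    rng = random.Random(seed)
    bins = [[] for _ in range(n_bins)]
    for _ in range(n_samples):
        x = rng.random()           # assumption: X ~ Uniform(0, 1)
        xi = rng.gauss(0.0, 0.5)   # assumption: zero-mean Gaussian noise, independent of X
        z = g(x) + xi
        bins[min(int(x * n_bins), n_bins - 1)].append(z)
    centers = [(b + 0.5) / n_bins for b in range(n_bins)]
    # Average of Z inside each bin approximates the conditional expectation there.
    return [(c, sum(zs) / len(zs)) for c, zs in zip(centers, bins)]


estimates = binned_conditional_mean()
# Largest deviation between the estimated conditional mean and g at the bin centers.
max_gap = max(abs(z_mean - g(c)) for c, z_mean in estimates)
print(f"largest |E[Z|X] - g(X)| over bins: {max_gap:.4f}")
```

Under these assumptions the per-bin averages of $Z$ stay close to $g$ evaluated at the bin centers, which is what the factorization predicts; the noise contributes nothing to the conditional mean.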
For classification problems the MMSE solution also enjoys important properties. Rather than being derived from the regression setting (applying the above $Z = g(X) + \xi(X)$ model to classification raises mathematical difficulties), these properties can be derived [83, 185, 26, 252] by first observing that the empirical MSE risk, $\hat{R}_{\text{MSE}}$, for a classifier with $c$ target values $t_k$ and outputs $y_k$ is written as

$$\hat{R}_{\text{MSE}} = \frac{1}{n} \sum_{k=1}^{c} \sum_{i=1}^{n_k} \left( t_{ik} - y_k(x_i) \right)^2 , \tag{2.6}$$
where $n_k$ is the number of instances of class $\omega_k$ and each $y_k$ depends on the parameter vector $w$. For $n \to \infty$, and after some mathematical manipulations, one obtains:
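$$\hat{R}_{\text{MSE}} \xrightarrow[n \to \infty]{} R_{\text{MSE}} = \sum_{k=1}^{c} \int_{X|T} \left( E[T_k \mid x] - y_k(x) \right)^2 f_{X|t}(x)\, dx + \sum_{k=1}^{c} \int_{X|T} \left( E[T_k^2 \mid x] - E^2[T_k \mid x] \right) f_{X|t}(x)\, dx . \tag{2.7}$$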
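The manipulations behind (2.7) are not spelled out here. One plausible route, sketched under the assumption that the empirical average converges to the corresponding expectation as $n \to \infty$, is to add and subtract $E[T_k \mid x]$ inside the square for a fixed $x$ and a fixed output $k$:

$$
\begin{aligned}
E\left[ (T_k - y_k(x))^2 \mid x \right]
&= E\left[ \left( T_k - E[T_k \mid x] + E[T_k \mid x] - y_k(x) \right)^2 \mid x \right] \\
&= \left( E[T_k \mid x] - y_k(x) \right)^2
 + 2 \left( E[T_k \mid x] - y_k(x) \right) E\left[ T_k - E[T_k \mid x] \mid x \right]
 + E\left[ (T_k - E[T_k \mid x])^2 \mid x \right] \\
&= \left( E[T_k \mid x] - y_k(x) \right)^2 + E[T_k^2 \mid x] - E^2[T_k \mid x] .
\end{aligned}
$$

The cross term vanishes because $E\left[ T_k - E[T_k \mid x] \mid x \right] = 0$. Integrating both remaining terms against the density of $x$ and summing over $k$ gives an expression with the two-term structure of (2.7).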
The second term of (2.7) represents a variance of the $t_k$ and does not depend on parameter tuning. Thus, the minimization of $\hat{R}_{\text{MSE}}$ for $n \to \infty$ implies the minimization of the first term of (2.7). In optimal conditions (to be mentioned shortly), that amounts to obtaining

$$y_k(x) = E[T_k \mid x] . \tag{2.8}$$
This result is the classification-setting version of the general result ($Y = E[Z \mid X]$) previously mentioned for the regression setting. Expression (2.8) can be written out in detail as:

$$y_k(x) = E[T_k \mid x] = \sum_{i=1}^{n} t_i P(T_k = t_i \mid x) ; \tag{2.9}$$
implying, for a 0-1 coding scheme of the $t_i$,

$$y_k(x) = P(T_k \mid x) . \tag{2.10}$$
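The passage from (2.9) to (2.10) can be illustrated on a small discrete example. The sketch below is an assumption-laden toy, not the authors' construction: the joint table `JOINT`, the two-class setting, and the grid search over candidate outputs are all hypothetical. It uses exact rational arithmetic so that the comparison is not blurred by rounding.

```python
from fractions import Fraction

# Hypothetical joint probabilities P(x, class) for a discrete feature x in {0, 1, 2}
# and two classes; the numbers are made up for illustration and sum to 1.
JOINT = {
    0: [Fraction(3, 10), Fraction(1, 10)],
    1: [Fraction(1, 10), Fraction(1, 10)],
    2: [Fraction(1, 10), Fraction(3, 10)],
}


def posterior(x: int, k: int) -> Fraction:
    """P(class k | x), read off the joint table."""
    return JOINT[x][k] / sum(JOINT[x])


def conditional_mean_target(x: int, k: int) -> Fraction:
    """E[T_k | x] = sum_i t_i P(T_k = t_i | x) with 0-1 coding (T_k = 1 iff the class is k)."""
    p_one = posterior(x, k)
    return 0 * (1 - p_one) + 1 * p_one


def pointwise_mse(x: int, k: int, y: Fraction) -> Fraction:
    """E[(T_k - y)^2 | x] for a candidate output value y."""
    p_one = posterior(x, k)
    return (1 - p_one) * (0 - y) ** 2 + p_one * (1 - y) ** 2


# Grid search over candidate outputs y in {0, 0.01, ..., 1} for every (x, k) pair.
candidates = [Fraction(j, 100) for j in range(101)]
best = {
    (x, k): min(candidates, key=lambda y: pointwise_mse(x, k, y))
    for x in JOINT
    for k in range(2)
}

for (x, k), y in sorted(best.items()):
    print(f"x={x}, k={k}: MSE-optimal output {y}, posterior {posterior(x, k)}")
```

In this toy setting the output that minimizes the pointwise squared error coincides with the class posterior for every pair $(x, k)$, in line with (2.10).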