$$R_L(Y) = \sum_{t\in T} P(t)\,E_{Y|t}[L(t,Y)] = \sum_{t\in\{-1,1\}} P(t)\int_{-1}^{1} L(t,y)\,f_{Y|t}(y)\,dy, \qquad (2.25)$$

if the absolute integrability condition (2.23) for L(t, y) is satisfied.
Applying again Theorem 2.1 to E = T − Y, the risk functional (2.25) is finally expressed in terms of the error variable as

$$R_L(E) = \sum_{t\in\{-1,1\}} P(t)\int_{t-1}^{t+1} L(t,e)\,f_{E|t}(e)\,de. \qquad (2.26)$$
For MSE, L_SE(t, e) = (t − y)² = e² depends only on e (or, in more detail, e_w = t − y_w). We then have R_MSE(E) = E_{T,E}[E²], the second-order moment of the error, which is empirically estimated as in (2.5) and can be rewritten as

$$R_{MSE}(Y) = \frac{1}{n}\left[\sum_{t_i=1}(t_i - y_i)^2 + \sum_{t_i=-1}(t_i - y_i)^2\right]. \qquad (2.27)$$
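As a quick numerical illustration of formula (2.27), the empirical MSE risk can be computed by summing the squared errors over each class separately; a minimal sketch in Python, where the targets and classifier outputs are invented for illustration:

```python
import numpy as np

# Hypothetical {-1, 1}-coded targets and classifier outputs in [-1, 1].
t = np.array([1, 1, -1, -1, 1, -1])
y = np.array([0.8, 0.6, -0.9, -0.2, 0.1, -0.7])

# Empirical MSE risk, formula (2.27): squared errors summed per class,
# then averaged over all n instances.
n = len(t)
r_mse = ((t[t == 1] - y[t == 1]) ** 2).sum() / n \
      + ((t[t == -1] - y[t == -1]) ** 2).sum() / n

# Equivalent single-sum form: the mean of e_i^2 with e_i = t_i - y_i.
assert np.isclose(r_mse, np.mean((t - y) ** 2))
```

The split by class is redundant for MSE itself, but it mirrors the class-conditional decomposition used in the theoretical functionals above.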
Let us now consider the cross-entropy risk whose empirical estimate is given by formula (2.16). For a two-class problem and the {0, 1}-coding scheme one obtains the following popularized expression, when the classifier has a single output:

$$R_{CE}(Y) = -\left[\sum_{t_i=0}(1 - t_i)\ln(1 - y_i) + \sum_{t_i=1} t_i\ln(y_i)\right]. \qquad (2.28)$$
The {−1, 1}-coding implies a y → (y + 1)/2 transformation; formula (2.28) is then rewritten as

$$R_{CE}(Y) = -\left[\sum_{t_i=-1}\ln\frac{1 - y_i}{2} + \sum_{t_i=1}\ln\frac{1 + y_i}{2}\right]. \qquad (2.29)$$
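Since (2.29) is just (2.28) after the y → (y + 1)/2 recoding, the two expressions can be checked against each other numerically. A small sketch with invented targets and outputs; the names t01 and y01 for the {0, 1}-coded versions are chosen here, not taken from the text:

```python
import numpy as np

# Hypothetical {-1, 1}-coded targets and classifier outputs in (-1, 1).
t = np.array([1, -1, 1, -1])
y = np.array([0.9, -0.8, 0.2, 0.4])

# Formula (2.29): cross-entropy risk with {-1, 1} coding.
r_29 = -(np.log((1 - y[t == -1]) / 2).sum()
         + np.log((1 + y[t == 1]) / 2).sum())

# Formula (2.28) with {0, 1}-coded targets t01 and outputs y01 = (y + 1)/2.
t01 = (t + 1) // 2
y01 = (y + 1) / 2
r_28 = -((1 - t01[t01 == 0]) * np.log(1 - y01[t01 == 0])).sum() \
       - (t01[t01 == 1] * np.log(y01[t01 == 1])).sum()

assert np.isclose(r_28, r_29)  # the two codings give the same risk
```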
When divided by n, R_CE(Y) can be viewed as the empirical estimate of the following (theoretical) risk functional:

$$R_{CE}(Y) = -P(-1)\int_{-1}^{1}\ln(1 - y)\,f_{Y|-1}(y)\,dy - P(1)\int_{-1}^{1}\ln(1 + y)\,f_{Y|1}(y)\,dy + \ln(2). \qquad (2.30)$$
Applying the same variable transformation as we did before, the CE risk functional is finally expressed in terms of the error variable as

$$R_{CE}(E) = -\sum_{t\in\{-1,1\}} P(t)\int_{t-1}^{t+1}\ln(2 - te)\,f_{E|t}(e)\,de + \ln(2). \qquad (2.31)$$
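The substitution behind (2.31) is e = t − y, so 1 + ty = 1 + t(t − e) = 2 − te for t ∈ {−1, 1}: the per-sample integrand is unchanged by the change of variable. A small numerical sanity check of this identity (the sample values are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
t = rng.choice([-1, 1], size=1000)         # class labels in {-1, 1}
y = rng.uniform(-0.99, 0.99, size=1000)    # classifier outputs in (-1, 1)
e = t - y                                  # error variable, e = t - y

# Identity behind (2.31): 1 + t*y = 2 - t*e, hence the per-sample
# CE terms -ln((1 + t*y)/2) and -ln((2 - t*e)/2) coincide.
ce_y = -np.log((1 + t * y) / 2)
ce_e = -np.log((2 - t * e) / 2)
assert np.allclose(ce_y, ce_e)
```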