rule h(t0). Let Q(y0, h(t0)) be the criterion that scores the discrepancy
between an observed value y0 and its predicted value h(t0). The forms of
both the prediction rule h and the criterion Q are given a priori. I define
the true error of h to be the expected error that h makes on a new
observation x0 = (t0, y0) from F,
q(\hat{F}, F) = E_{x_0 \sim F}\, Q(y_0, h_{\hat{F}}(t_0)).
In addition, I call the quantity
q_{\mathrm{app}} = q(\hat{F}, \hat{F}) = E_{x_0 \sim \hat{F}}\, Q(y_0, h_{\hat{F}}(t_0)) = \frac{1}{n} \sum_{i=1}^{n} Q(y_i, h_{\hat{F}}(t_i)),
the apparent error of h . The difference
R(\hat{F}, F) = q(\hat{F}, F) - q(\hat{F}, \hat{F})
is the excess error of h . The expected excess error is
r = E_{\hat{F} \sim F}\, R(\hat{F}, F),
where the expectation is taken over F̂, which is obtained from x1,..., xn
generated by F . In Section 4, I will clarify the distinction between excess
error and expected excess error. I will consider estimates of the expected
excess error, although what we would rather have are estimates of the
excess error.
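To make these definitions concrete, here is a minimal simulation sketch (not from the text), assuming a squared-error criterion Q and a least-squares line as the prediction rule h. Because F is known in the simulation, the true error q(F̂, F) can be approximated by averaging Q over a large fresh sample from F, while the apparent error q(F̂, F̂) is the average over the training sample itself:

```python
import numpy as np

rng = np.random.default_rng(0)

def fit_rule(t, y):
    """Least-squares line: the realized prediction rule h based on the sample."""
    a, b = np.polyfit(t, y, 1)
    return lambda t0: a * t0 + b

def Q(y0, pred):
    """Squared-error discrepancy criterion (an illustrative choice)."""
    return (y0 - pred) ** 2

# Training sample x_1,...,x_n from F: here y = 2t + Gaussian noise.
n = 50
t = rng.uniform(0, 1, n)
y = 2 * t + rng.normal(0, 0.5, n)
h = fit_rule(t, y)

# Apparent error q(F-hat, F-hat): average discrepancy on the training sample.
apparent = np.mean(Q(y, h(t)))

# True error q(F-hat, F): expectation over new draws x0 = (t0, y0) from F,
# approximated by a large fresh sample since F is known in this simulation.
t0 = rng.uniform(0, 1, 100_000)
y0 = 2 * t0 + rng.normal(0, 0.5, 100_000)
true_err = np.mean(Q(y0, h(t0)))

# Excess error R(F-hat, F) = q(F-hat, F) - q(F-hat, F-hat).
excess = true_err - apparent
print(apparent, true_err, excess)
```

The apparent error is typically optimistic (smaller than the true error) because the same data are used both to fit h and to evaluate it, which is why the excess error is usually positive.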
I will consider three estimates (the bootstrap, the jackknife, and cross-
validation) of the expected excess error. The bootstrap procedure for
estimating r = E_{F̂∼F} R(F̂, F) replaces F with F̂. Thus
r_{\mathrm{boot}} = E_{\hat{F}^* \sim \hat{F}}\, R(\hat{F}^*, \hat{F}),
where F̂* is the empirical distribution function of a random sample
x*1,..., x*n from F̂. Since F̂ is known, the expectation can in principle be
calculated. The calculations are usually too complicated to perform
analytically, however, so we resort to Monte Carlo methods.
1. Generate x*1,..., x*n, a random sample from F̂. Let F̂* be the
empirical distribution of x*1,..., x*n.
2. Construct h*, the realized prediction rule based on x*1,..., x*n.
3. Form
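Step 3 is cut off in this excerpt; given the definition of r_boot above, the quantity formed there is presumably the bootstrap excess error R(F̂*, F̂) = q(F̂*, F̂) − q(F̂*, F̂*), averaged over many replications. Under that assumption, and reusing the illustrative squared-error criterion and least-squares rule, the Monte Carlo loop can be sketched as:

```python
import numpy as np

rng = np.random.default_rng(1)

def fit_rule(t, y):
    # Least-squares line as the realized prediction rule (an illustrative choice).
    a, b = np.polyfit(t, y, 1)
    return lambda t0: a * t0 + b

def bootstrap_excess_error(t, y, B=200):
    """Monte Carlo estimate of r_boot = E R(F*, F-hat) under squared-error Q."""
    n = len(t)
    r = np.empty(B)
    for b in range(B):
        # 1. Draw x*_1,...,x*_n from F-hat: resample the data with replacement.
        idx = rng.integers(0, n, n)
        tb, yb = t[idx], y[idx]
        # 2. Construct h*, the realized rule based on the bootstrap sample.
        h_star = fit_rule(tb, yb)
        # 3. Form R(F*, F-hat) = q(F*, F-hat) - q(F*, F*): the error of h* on
        #    the original sample minus its apparent error on the bootstrap sample.
        q_hat = np.mean((y - h_star(t)) ** 2)
        q_star = np.mean((yb - h_star(tb)) ** 2)
        r[b] = q_hat - q_star
    return r.mean()

# Usage on a simulated sample.
n = 50
t = rng.uniform(0, 1, n)
y = 2 * t + rng.normal(0, 0.5, n)
print(bootstrap_excess_error(t, y))
```

Averaging R(F̂*, F̂) over the B replications approximates the expectation over F̂* ∼ F̂ that defines r_boot; the Monte Carlo error shrinks as B grows.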