excess error over the many training samples that Gregory might have
observed and, therefore, over many realizations of this prediction rule.
Because $\hat{r}_{\text{cross}}$, $\hat{r}_{\text{jack}}$, and $\hat{r}_{\text{boot}}$ average over many realizations, they are, strictly
speaking, estimates of the expected excess error. Gregory, however, would
much rather know the excess error of his particular realization.
It is perhaps unfair to think of $\hat{r}_{\text{cross}}$, $\hat{r}_{\text{jack}}$, and $\hat{r}_{\text{boot}}$ as estimators of the
excess error. A simple analogy may be helpful. Suppose $X$ is an observation
from the distribution $F_\zeta$, and $T(X)$ estimates $\zeta$. The bias is the
expected difference $E[T(X) - \zeta]$ and is analogous to the expected excess
error. The difference $T(X) - \zeta$ is analogous to the excess error. Getting a
good estimate of the bias is sometimes possible, but getting a good estimate
of the difference $T(X) - \zeta$ would be equivalent to knowing $\zeta$.
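The analogy can be made concrete with a small simulation. The sketch below is only an illustration under assumed choices that are not in the text: the estimand (called `zeta` in the code) is the variance of a normal distribution, T(X) is the divide-by-n variance estimate, and the function names are hypothetical. Averaging T(X) minus the estimand over many samples recovers the bias well, while the difference for any single sample scatters widely around it.

```python
import random
import statistics

def variance_mle(xs):
    """Divide-by-n variance estimate T(X); its exact bias is -zeta / n."""
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)

def realized_differences(zeta, n, replications, rng):
    """Return T(X) - zeta for many independent samples X of size n from N(0, zeta)."""
    sd = zeta ** 0.5
    return [
        variance_mle([rng.gauss(0.0, sd) for _ in range(n)]) - zeta
        for _ in range(replications)
    ]

rng = random.Random(1)  # fixed seed so the run is reproducible
diffs = realized_differences(zeta=4.0, n=10, replications=20000, rng=rng)

# The average difference estimates the bias (exact value here: -4.0 / 10).
bias_estimate = statistics.mean(diffs)
# The spread shows how far a single realized difference can sit from the bias.
spread = statistics.pstdev(diffs)
print(f"estimated bias: {bias_estimate:.3f}, spread of differences: {spread:.3f}")
```

In this setting the spread is several times larger than the bias itself, which is the sense in which an estimate of the expected difference says little about one particular realization.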
In the simulations, the underlying model was the logistic model that
assumes $x_1 = (t_1, y_1), \ldots, x_n = (t_n, y_n)$ are independent and identically
distributed such that $y_i$ conditional on $t_i$ is Bernoulli with probability
of success $\theta(t_i)$, where

$$\operatorname{logit}\,\theta(t_i) = \beta_0 + t_i \beta, \tag{4.1}$$

where $t_i = (t_{i1}, \ldots, t_{ip})$ is $p$-variate normal with zero mean and a specified
covariance structure $\Sigma$.
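One way to draw a training sample from model (4.1) is sketched below, using only the Python standard library. This is an assumed illustration, not the author's simulation code: the helper names `cholesky` and `simulate_logistic` are hypothetical, and the covariance matrix and coefficients in the usage lines are placeholder values.

```python
import math
import random

def cholesky(sigma):
    """Lower-triangular L with L L^T = sigma (sigma symmetric positive definite)."""
    p = len(sigma)
    L = [[0.0] * p for _ in range(p)]
    for i in range(p):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(sigma[i][i] - s)
            else:
                L[i][j] = (sigma[i][j] - s) / L[j][j]
    return L

def inverse_logit(eta):
    """Numerically stable inverse of the logit function."""
    if eta >= 0:
        return 1.0 / (1.0 + math.exp(-eta))
    e = math.exp(eta)
    return e / (1.0 + e)

def simulate_logistic(n, beta0, beta, sigma, rng):
    """Draw n i.i.d. pairs (t_i, y_i) with logit P(y_i = 1 | t_i) = beta0 + t_i . beta."""
    L = cholesky(sigma)
    p = len(beta)
    sample = []
    for _ in range(n):
        # t_i = L z with z standard normal gives a zero-mean normal with covariance sigma.
        z = [rng.gauss(0.0, 1.0) for _ in range(p)]
        t = [sum(L[i][k] * z[k] for k in range(i + 1)) for i in range(p)]
        eta = beta0 + sum(b * x for b, x in zip(beta, t))
        y = 1 if rng.random() < inverse_logit(eta) else 0
        sample.append((t, y))
    return sample

# Placeholder inputs, for illustration only.
rng = random.Random(0)
data = simulate_logistic(20, 0.0, [1.0, 2.0], [[1.0, 0.0], [0.0, 1.0]], rng)
```

Each call produces one realization of the training sample; repeating it with fresh random draws gives the many realizations over which an expected excess error is averaged.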
I performed two sets of simulations. In the first set (simulations 1.1,
1.2, 1.3) I let the sample sizes be, respectively, $n = 20, 40, 60$; the dimension
of $t_i$ be $p = 4$; and

$$\Sigma = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & \tau & 0 \\ 0 & \tau & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}, \qquad \beta_0 = 0, \qquad \beta = \begin{pmatrix} 1 \\ 2 \\ 0 \\ 0 \end{pmatrix}, \tag{4.2}$$

where $\tau = 0.80$. We would expect a good prediction rule to choose variables
$t_1$ and $t_2$, and due to the correlation between variables $t_2$ and $t_3$, a
prediction rule choosing $t_1$ and $t_3$ would probably not be too bad. In the
second set of simulations (simulations 2.1, 2.2, 2.3), the sample sizes were
again $n = 20, 40, 60$; the dimension of $t_i$ was increased to $p = 6$; and
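
$$\Sigma = \begin{pmatrix} 1 & 0 & 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 & \tau & 0 \\ 0 & 0 & 0 & \tau & 1 & 0 \\ 0 & 0 & 0 & 0 & 0 & 1 \end{pmatrix}, \qquad \beta_0 = 0, \qquad \beta = \begin{pmatrix} 1 \\ 1 \\ 1 \\ 2 \\ 0 \\ 0 \end{pmatrix}. \tag{4.3}$$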
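
The two parameter settings and the six simulation labels can be written down compactly. The sketch below is one possible encoding, with hypothetical names (`block_covariance`, `SET_1`, `SET_2`, `simulation_grid`) that do not come from the text; it assumes the covariance matrix is the identity except for a single correlated pair of variables, as in (4.2) and (4.3).

```python
TAU = 0.80
SAMPLE_SIZES = (20, 40, 60)

def block_covariance(p, i, j, tau):
    """p x p identity matrix with correlation tau between variables i and j (1-based)."""
    sigma = [[1.0 if r == c else 0.0 for c in range(p)] for r in range(p)]
    sigma[i - 1][j - 1] = tau
    sigma[j - 1][i - 1] = tau
    return sigma

# First set, as in (4.2): p = 4, variables 2 and 3 correlated.
SET_1 = {
    "p": 4,
    "beta0": 0.0,
    "beta": [1.0, 2.0, 0.0, 0.0],
    "sigma": block_covariance(4, 2, 3, TAU),
}

# Second set, as in (4.3): p = 6, variables 4 and 5 correlated.
SET_2 = {
    "p": 6,
    "beta0": 0.0,
    "beta": [1.0, 1.0, 1.0, 2.0, 0.0, 0.0],
    "sigma": block_covariance(6, 4, 5, TAU),
}

def simulation_grid():
    """Map the labels '1.1' ... '2.3' to (sample size, parameter setting)."""
    grid = {}
    for set_index, config in enumerate((SET_1, SET_2), start=1):
        for size_index, n in enumerate(SAMPLE_SIZES, start=1):
            grid[f"{set_index}.{size_index}"] = (n, config)
    return grid

for label, (n, config) in simulation_grid().items():
    print(label, "n =", n, "p =", config["p"])
```

In both settings the variable with coefficient 2 is correlated with a variable whose coefficient is 0, so a rule that picks the correlated stand-in loses some, but not all, of the predictive information.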