Graphics Reference
In-Depth Information
Choice of kernel is relatively unimportant, as any reasonable p.d.f. will provide
a similar picture. Optimal mean integrated squared error may be achieved using the
Epanechnikov kernel, a truncated parabola with form
$$K_E(t) = \tfrac{3}{4}\,(1 - t^2), \qquad |t| < 1.$$
The discontinuous derivative at $|t| = 1$ is inherited by the estimate at numerous points,
so for visual purposes this choice of kernel may not be ideal. The biweight kernel,
$$K_B(t) = \tfrac{15}{16}\,(1 - t^2)^2, \qquad |t| < 1,$$
is nearly as efficient and has a continuous first derivative, so it is often preferred for
the smoother appearance of its estimates. (The triweight kernel,
$K_T(t) = \tfrac{35}{32}\,(1 - t^2)^3$ for $|t| < 1$, has a continuous second derivative.)
Finally, the standard normal kernel, $K_N(t) = \phi(t)$, is smoother still and often
preferred for its infinite number of continuous derivatives and its uniquely
well-behaved mode structure (Sect. . . ).
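As a rough illustration of the four kernels discussed above, the following sketch writes each one as a plain Python function, checks numerically that it integrates to one, and estimates the one-sided slope at the edge of the support. The function names, the midpoint-rule integrator, and the step sizes are choices made for this example only and do not come from the text.

```python
import math


def epanechnikov(t):
    """Truncated parabola: (3/4)(1 - t^2) on |t| < 1, zero elsewhere."""
    return 0.75 * (1.0 - t * t) if abs(t) < 1.0 else 0.0


def biweight(t):
    """(15/16)(1 - t^2)^2 on |t| < 1, zero elsewhere."""
    return (15.0 / 16.0) * (1.0 - t * t) ** 2 if abs(t) < 1.0 else 0.0


def triweight(t):
    """(35/32)(1 - t^2)^3 on |t| < 1, zero elsewhere."""
    return (35.0 / 32.0) * (1.0 - t * t) ** 3 if abs(t) < 1.0 else 0.0


def normal(t):
    """Standard normal density phi(t)."""
    return math.exp(-0.5 * t * t) / math.sqrt(2.0 * math.pi)


def integrate(fn, lo, hi, steps=20000):
    """Midpoint-rule approximation of the integral of fn over [lo, hi]."""
    width = (hi - lo) / steps
    return width * sum(fn(lo + (i + 0.5) * width) for i in range(steps))


def slope_just_inside(kernel, eps=1e-6):
    """One-sided difference quotient of the kernel as t approaches 1 from below."""
    return (kernel(1.0) - kernel(1.0 - eps)) / eps


# Kernel plus the interval used for numerical integration
# (the normal kernel is truncated at +/- 8, where its mass is negligible).
KERNELS = {
    "Epanechnikov": (epanechnikov, -1.0, 1.0),
    "biweight": (biweight, -1.0, 1.0),
    "triweight": (triweight, -1.0, 1.0),
    "normal": (normal, -8.0, 8.0),
}

for name, (kernel, lo, hi) in KERNELS.items():
    print(name, "integrates to about", round(integrate(kernel, lo, hi), 6))

# The Epanechnikov slope stays away from zero at the boundary (a kink),
# whereas the biweight slope shrinks to zero there.
print("Epanechnikov edge slope:", slope_just_inside(epanechnikov))
print("biweight edge slope:", slope_just_inside(biweight))
```

The nonzero edge slope of the Epanechnikov kernel is the discontinuity in the derivative that the estimate inherits, while the biweight kernel joins the axis smoothly.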
Choice of the bandwidth h is more critical and, as with a histogram, can have a very
strong effect on the resulting estimate. Large h leads to oversmoothing and (asymptotic)
bias proportional to $h^2 f''(x)$, while small h leads to an undersmoothed, highly
multimodal estimate with (again asymptotic) variance proportional to $f(x)/(nh)$.
Optimization shows that the asymptotic mean integrated squared error may be minimized
by choosing $h = c_K\,[R(f'')\,n]^{-1/5}$, with $c_K$ depending on the kernel and
equaling 1.719, 2.036, and 0.776 for the Epanechnikov, biweight, and normal kernels,
respectively. As with the histogram, the presence of a function of f in the optimal
choice of h means that alternative approximations are required.
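The kernel constants can be checked numerically. The sketch below assumes the standard asymptotic result $c_K = [R(K)/\sigma_K^4]^{1/5}$, where $R(K) = \int K(t)^2\,dt$ and $\sigma_K^2 = \int t^2 K(t)\,dt$. That expression is not spelled out in the passage above, so treat it as an assumption of this example; the helper names are likewise hypothetical.

```python
import math


def epanechnikov(t):
    return 0.75 * (1.0 - t * t) if abs(t) < 1.0 else 0.0


def biweight(t):
    return (15.0 / 16.0) * (1.0 - t * t) ** 2 if abs(t) < 1.0 else 0.0


def normal(t):
    return math.exp(-0.5 * t * t) / math.sqrt(2.0 * math.pi)


def integrate(fn, lo, hi, steps=20000):
    """Midpoint-rule approximation of the integral of fn over [lo, hi]."""
    width = (hi - lo) / steps
    return width * sum(fn(lo + (i + 0.5) * width) for i in range(steps))


def amise_constant(kernel, lo, hi):
    """Assumed form c_K = [R(K) / sigma_K^4]^(1/5) for a symmetric kernel."""
    roughness = integrate(lambda t: kernel(t) ** 2, lo, hi)     # R(K)
    variance = integrate(lambda t: t * t * kernel(t), lo, hi)   # sigma_K^2
    return (roughness / variance ** 2) ** 0.2


constants = {
    "Epanechnikov": amise_constant(epanechnikov, -1.0, 1.0),
    "biweight": amise_constant(biweight, -1.0, 1.0),
    # Truncating the normal kernel at +/- 10 loses a negligible amount of mass.
    "normal": amise_constant(normal, -10.0, 10.0),
}

for name, value in constants.items():
    print(f"c_K for the {name} kernel: {value:.3f}")
```

Under that assumption the three values round to 1.719, 2.036, and 0.776, matching the constants quoted for the Epanechnikov, biweight, and normal kernels.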
One approach is again to assume a normal distribution for f. This leads to the
normal reference rule $h = c_K\,[R(f'')\,n]^{-1/5} = c_{NK}\,\sigma\,n^{-1/5}$, for
$c_{NK}$ = 2.34, 2.78, and 1.06 for the Epanechnikov, biweight, and normal kernels.
This will be oversmoothed for most nonnormal densities, so an initial estimate with
extreme skewness or multiple strong modes may argue for a second estimate with
smaller h. Figure . demonstrates the effect of h with three normal kernel estimates
for a subset of points of the minimum temperature data from Fig. . . The normal
reference rule suggests that h = . for this data, but the strong bimodality indicates
that a smaller choice, such as the middle h = . example, is probably more appropriate.
Taking h smaller still, as in the rightmost h = . estimate of Fig. . , leads to a
clearly undersmoothed estimate. Note that the individual kernels at the bottom of
each plot are scaled correctly relative to one another, but not to their full
estimates. For visibility, each has been rescaled vertically to times the height of
a single kernel used for this sample.
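The effect of h described above can be explored with a small sketch. It applies the normal reference rule with the normal-kernel constant 1.06, evaluates a normal kernel estimate on a grid, and counts local maxima for several bandwidths. The sample here is a made-up, deterministic bimodal data set, not the minimum temperature data, and the function names, grid size, and bandwidth fractions are assumptions of this example.

```python
import math

C_NK_NORMAL = 1.06  # normal reference rule constant for the normal kernel


def normal_reference_bandwidth(data, c_nk=C_NK_NORMAL):
    """h = c_NK * sigma * n^(-1/5), with sigma the sample standard deviation."""
    n = len(data)
    mean = sum(data) / n
    sigma = math.sqrt(sum((x - mean) ** 2 for x in data) / (n - 1))
    return c_nk * sigma * n ** (-0.2)


def kde(x, data, h):
    """Normal kernel estimate at x: (1 / (n h)) * sum of phi((x - x_i) / h)."""
    total = sum(math.exp(-0.5 * ((x - xi) / h) ** 2) for xi in data)
    return total / (len(data) * h * math.sqrt(2.0 * math.pi))


def count_modes(data, h, grid_size=400):
    """Count local maxima of the estimate on an evenly spaced grid."""
    lo, hi = min(data) - 5.0 * h, max(data) + 5.0 * h
    grid = [lo + (hi - lo) * i / (grid_size - 1) for i in range(grid_size)]
    dens = [kde(x, data, h) for x in grid]
    return sum(
        1
        for i in range(1, grid_size - 1)
        if dens[i - 1] < dens[i] >= dens[i + 1]
    )


# Hypothetical bimodal sample: two evenly spaced clusters around -2 and 3.
sample = [-2.0 + 0.1 * k for k in range(-5, 6)] + [3.0 + 0.1 * k for k in range(-5, 6)]

h_ref = normal_reference_bandwidth(sample)
print("normal reference rule bandwidth:", round(h_ref, 3))

# Shrinking h below the reference value shows how the mode count responds.
for fraction in (1.0, 0.5, 0.1, 0.02):
    h = fraction * h_ref
    print(f"h = {h:.3f}: {count_modes(sample, h)} local maxima on the grid")
```

Counting modes on a grid is only a crude diagnostic, since very narrow bumps can fall between grid points, but it gives a quick sense of when an estimate has moved from oversmoothed to undersmoothed.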
A variety of more highly computational approaches for data-based bandwidth
selection have also been proposed. These include variations on cross-validation
[including Scott and Terrell ( ), Hall and Marron ( ), and Sain et al. ( )] as
well as methods using pilot estimates of $R(f'')$ [such as Sheather and Jones ( )
and Hall et al. ( )]. While these can lead to improved choice of h and accordingly