5.1.5 Multiscale Visualization of Density Estimates
The importance of the smoothing parameter in the appearance of density estimates
has led to some additional visualization tools that examine a dataset at a wide variety
of smoothing levels.
Recognizing that the most visually striking and important aspect of a density
estimate is often the number and location of modes, Minnotte and Scott ( ) proposed
plotting those against the bandwidth for a set of kernel estimates using a wide range of
smoothing parameters. As the number of modes is roughly monotone decreasing in
the bandwidth (strictly, if the normal kernel is used), the authors called the resulting
plot the “mode tree”. The mode tree is highly effective for examining modal behavior
under varying levels of smoothing for a dataset or density estimation method. Unless
the modal behavior of the density estimation method is the effect of interest, the
normal kernel is recommended. As Silverman ( ) showed, the number of modes is
nonincreasing in h for a normal kernel density estimate. Visually, this means that the
mode tree is well behaved in this case, with mode traces appearing as you continue
down the tree and continuing once they appear. Minnotte and Scott ( ) demonstrate
that this is not the case for estimates derived from nonnormal kernels.
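The raw material of a mode tree can be sketched in a few lines. The following is a minimal illustration, not the authors' implementation: it assumes a normal kernel, a fixed evaluation grid, and a hypothetical toy dataset of two clusters; the function names (`gaussian_kde`, `find_modes`, `mode_tree`) are invented for this example.

```python
import math

def gaussian_kde(data, h, grid):
    """Normal-kernel density estimate with bandwidth h, evaluated on grid."""
    norm = 1.0 / (len(data) * h * math.sqrt(2.0 * math.pi))
    return [norm * sum(math.exp(-0.5 * ((x - xi) / h) ** 2) for xi in data)
            for x in grid]

def find_modes(grid, dens):
    """Grid locations that are strict local maxima of the estimate."""
    return [grid[i] for i in range(1, len(grid) - 1)
            if dens[i - 1] < dens[i] > dens[i + 1]]

def mode_tree(data, bandwidths, grid):
    """Return (h, mode_locations) pairs, one per bandwidth."""
    return [(h, find_modes(grid, gaussian_kde(data, h, grid)))
            for h in bandwidths]

# Hypothetical toy data: two clusters of 21 equally spaced points each.
data = ([-2 + 0.1 * k for k in range(-10, 11)]
        + [2 + 0.1 * k for k in range(-10, 11)])
grid = [-8 + 16 * i / 800 for i in range(801)]
# Log-spaced bandwidths from 4 down to 0.5 (largest first, the top of the tree).
bandwidths = [4.0 * (0.5 / 4.0) ** (j / 11) for j in range(12)]

tree = mode_tree(data, bandwidths, grid)
for h, modes in tree:
    print(f"h = {h:5.2f}  modes at {[round(m, 2) for m in modes]}")
```

Plotting each mode location against its bandwidth (conventionally on a log scale, with large bandwidths at the top) gives the tree. With the normal kernel, the number of modes listed should never drop as the bandwidth shrinks.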
Minnotte and Scott ( ) also demonstrate how additional features may be added
for an “enhanced mode tree,” including locations of antimodes and inflection points,
measures of sizes of modes, and regions of positive and negative second derivatives
(the latter are often called “bumps”). In this way, a great deal of information about
the behavior of the density estimates may be examined without restriction to a single
bandwidth, or even a small set of them.
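For a single bandwidth, the extra features of an enhanced mode tree can be read off the signs of the first and second derivatives of the estimate. The sketch below assumes a normal kernel and the same kind of hypothetical two-cluster toy data; `kde_derivatives` and `crossings` are illustrative names, and the derivatives are computed only up to a positive constant because only their signs are used.

```python
import math

def kde_derivatives(data, h, x):
    """Quantities proportional to f'(x) and f''(x) for a normal-kernel estimate."""
    d1 = d2 = 0.0
    for xi in data:
        u = (x - xi) / h
        w = math.exp(-0.5 * u * u)
        d1 -= u * w                # sign of the slope
        d2 += (u * u - 1.0) * w    # sign of the curvature
    return d1, d2

def crossings(grid, values, from_sign):
    """Midpoints of grid cells where values leave the sign from_sign."""
    return [0.5 * (grid[i] + grid[i + 1])
            for i in range(len(grid) - 1)
            if values[i] * from_sign > 0 and values[i + 1] * from_sign <= 0]

# Hypothetical toy data: two clusters of 21 equally spaced points each.
data = ([-2 + 0.1 * k for k in range(-10, 11)]
        + [2 + 0.1 * k for k in range(-10, 11)])
grid = [-8 + 16 * i / 800 for i in range(801)]
h = 1.0

d1, d2 = zip(*(kde_derivatives(data, h, x) for x in grid))
modes = crossings(grid, d1, +1)       # slope turns from positive to negative
antimodes = crossings(grid, d1, -1)   # slope turns from negative to positive
# Bumps: intervals of negative second derivative, bounded by inflection points.
# Pairing starts with ends assumes the curvature is positive at both grid edges.
bumps = list(zip(crossings(grid, d2, +1), crossings(grid, d2, -1)))

print("modes:", modes)
print("antimodes:", antimodes)
print("bumps:", bumps)
```

Repeating this over a range of bandwidths and drawing the antimodes, inflection points, and bump intervals alongside the mode traces gives the kind of display described above.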
Use of the filtered kernel technique of Marchette et al. ( ) leads to the “filtered
mode tree” of Marchette and Wegman ( ). The filtering reduces the visual importance
of minor modes in the tail of the density while inflating the prominence of large
central modes.
Overplotting the bumps (regions of negative second derivative) over different
bandwidths for multiple resamples, subsamples, or jittered versions of the data leads to the
“mode forest” of Minnotte et al. ( ). Again, the effect is a kind of visual inference,
emphasizing large central modes, while deemphasizing those minor ones in the tails
of the density.
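The overplotting idea can be approximated by counting, at each location and bandwidth, how many perturbed copies of the data place that location inside a bump. The sketch below is one possible reading, not the published procedure: it assumes a normal kernel, uses bounded uniform jitter rather than resampling or subsampling, and the names `in_bump` and `mode_forest` as well as the jitter width, replicate count, and seed are illustrative choices.

```python
import math
import random

def in_bump(data, h, x):
    """True if the normal-kernel estimate has negative second derivative at x."""
    total = 0.0
    for xi in data:
        u = (x - xi) / h
        total += (u * u - 1.0) * math.exp(-0.5 * u * u)
    return total < 0.0

def mode_forest(data, bandwidths, grid, n_rep=20, jitter=0.05, seed=0):
    """counts[j][i] = number of jittered replicates with a bump at (grid[i], bandwidths[j])."""
    rng = random.Random(seed)
    counts = [[0] * len(grid) for _ in bandwidths]
    for _ in range(n_rep):
        sample = [xi + rng.uniform(-jitter, jitter) for xi in data]
        for j, h in enumerate(bandwidths):
            for i, x in enumerate(grid):
                if in_bump(sample, h, x):
                    counts[j][i] += 1
    return counts

# Hypothetical toy data: two clusters of 21 equally spaced points each.
data = ([-2 + 0.1 * k for k in range(-10, 11)]
        + [2 + 0.1 * k for k in range(-10, 11)])
grid = [-4 + 8 * i / 80 for i in range(81)]
bandwidths = [0.5, 1.0, 2.0]

counts = mode_forest(data, bandwidths, grid)
# Crude text rendering: darker characters mark locations that are bumps more often.
shades = " .:*#"
for h, row in zip(bandwidths, counts):
    print(f"h = {h:3.1f} |" + "".join(shades[min(4, c * 5 // 21)] for c in row) + "|")
```

Cells that are dark in nearly every replicate correspond to stable bumps; cells that appear only occasionally are the ones the forest visually plays down.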
Finally, Chaudhuri and Marron ( ) combined the ideas of the mode tree, “scale
space” from computer vision research (which is closely related to smoothing parameter
variation), and some simple inference to propose SiZer (for Significant Zero crossings).
At each location-by-bandwidth pixel, one of three colors is plotted depending
on the estimated density slope: significantly positive, significantly negative, or not
significantly different from zero. The resulting patterns may be examined for modality
information.
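The per-pixel decision can be sketched as follows. This is a simplified stand-in rather than the published SiZer procedure: it assumes a normal-kernel derivative estimate, a pointwise normal interval with a fixed quantile of 1.96, and an effective-sample-size cutoff of 5 for flagging sparse regions. The function name `sizer_cell` and its parameters are hypothetical.

```python
import math

def sizer_cell(data, h, x, z=1.96, min_ess=5.0):
    """Classify the slope at (x, h): +1, -1, 0, or None if data are too sparse."""
    n = len(data)
    weights = [math.exp(-0.5 * ((x - xi) / h) ** 2) for xi in data]
    # Effective sample size: kernel weights relative to the weight at distance 0.
    if sum(weights) < min_ess:
        return None
    # Per-observation contributions to the kernel estimate of f'(x).
    c = 1.0 / (h ** 3 * math.sqrt(2.0 * math.pi))
    d = [c * (xi - x) * w for xi, w in zip(data, weights)]
    mean = sum(d) / n
    var = sum((di - mean) ** 2 for di in d) / (n - 1)
    se = math.sqrt(var / n)
    if mean > z * se:
        return 1       # significantly increasing
    if mean < -z * se:
        return -1      # significantly decreasing
    return 0           # not significantly different from zero

# Hypothetical toy data: two clusters of 21 equally spaced points each.
data = ([-2 + 0.1 * k for k in range(-10, 11)]
        + [2 + 0.1 * k for k in range(-10, 11)])
grid = [-4 + 8 * i / 80 for i in range(81)]
symbols = {1: "+", -1: "-", 0: "0", None: " "}

for h in [2.0, 1.0, 0.5]:
    row = "".join(symbols[sizer_cell(data, h, x)] for x in grid)
    print(f"h = {h:3.1f} |{row}|")
```

Each printed row is one bandwidth. A run of `+` followed by `-` marks a mode that is supported at that level of smoothing, and blanks mark pixels where the cutoff judges the data too sparse for inference.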
Figure . shows the mode tree, filtered mode tree, subsampled mode forest, and
SiZer plot for the minimum temperature data of Fig. . . For the latter, increasingly
dark grey levels indicate significantly positive slope, nonsignificant slope, and significantly
negative slope, while white identifies regions of insufficient data density for
inference. Note the emphasis on the bimodal structure in each of the four plots, al-