As mentioned earlier, hexagonal binning was introduced as a way of improving the
visual appeal of bivariate histograms that are displayed using some sort of glyph to
represent the frequency in each bin. An alternative is to use greyscale or color coding
of the bins to represent frequency. An example using the PRISM data is shown in
Fig. . . The number of hexagon bins was chosen to be consistent with the number
of bins in the rectangular mesh of the histogram in Fig. . . The hexagons appear to
further clarify the multimodal structure in the data.
5.2.2 Bivariate Kernel Density Estimators
Bivariate kernel estimates are more common than bivariate histograms. Basic
computation is a simple extension of the univariate method. Choice of the kernel
function is a little more complicated, but a bivariate kernel generated from one of
the standard univariate kernels is usually satisfactory.
Bandwidth selection also becomes more troublesome, as a bandwidth matrix is now
required. Product kernels (diagonal bandwidth matrix) that allow for different
degrees of smoothing in each dimension are appropriate for most datasets, and often
transformations of the original data are used to make the data more amenable to
a simpler form of the kernel. Wand and Jones ( ) give an excellent discussion
of the issues with parameterizing the bandwidth matrix for bivariate kernel density
estimators. The exact form of the bivariate kernel estimator follows from the more
general multivariate presentation at the beginning of Sect. . .
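
The product-kernel form described above can be sketched in a few lines. This is a minimal illustration rather than the text's own implementation: it assumes a Gaussian univariate kernel, and the names gaussian_kernel, product_kde, h1 and h2 are hypothetical. The estimate at a point is the average of products of univariate kernels, one per coordinate, each scaled by its own bandwidth.

```python
import math


def gaussian_kernel(u):
    """Standard normal density, used here as the univariate kernel (an assumption)."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def product_kde(x, y, data, h1, h2):
    """Product-kernel density estimate at the point (x, y).

    data is a list of (x_i, y_i) pairs. h1 and h2 are the bandwidths for the
    first and second coordinate, i.e. the diagonal of the bandwidth matrix.
    """
    n = len(data)
    total = 0.0
    for xi, yi in data:
        total += gaussian_kernel((x - xi) / h1) * gaussian_kernel((y - yi) / h2)
    return total / (n * h1 * h2)


# Hypothetical toy data, not the PRISM data discussed in the text.
points = [(0.0, 0.0), (1.0, 0.5), (0.2, -0.3)]
print(product_kde(0.0, 0.0, points, h1=0.8, h2=0.4))
```

Using separate values for h1 and h2 is what allows a different degree of smoothing in each dimension; setting them equal reduces to a single smoothing parameter.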
Scott ( ) demonstrates that the optimal smoothing parameter for the product
kernel estimator is proportional to $n^{-1/6}$. Further, for uncorrelated and normally
distributed data, the asymptotic MISE bandwidth is given as $h_k = \sigma_k n^{-1/6}$
for $k = 1, 2$, where $\sigma_k$ is the standard deviation of the $k$th variable.
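
As a rough illustration of this rule, the sketch below computes one bandwidth per coordinate as the standard deviation times n to the power -1/6. The exponent -1/6 is taken here to be the standard bivariate normal-reference rate. Replacing the unknown standard deviations with sample standard deviations is an assumption of the sketch, and the function name normal_reference_bandwidths is hypothetical.

```python
import statistics


def normal_reference_bandwidths(data):
    """Return (h1, h2) with h_k = s_k * n**(-1/6).

    data is a list of (x_i, y_i) pairs. s_k is the sample standard deviation
    of the k-th coordinate, plugged in for the unknown sigma_k (an assumption).
    """
    n = len(data)
    xs = [p[0] for p in data]
    ys = [p[1] for p in data]
    factor = n ** (-1.0 / 6.0)
    return statistics.stdev(xs) * factor, statistics.stdev(ys) * factor


# Hypothetical toy data: the second coordinate is twice as spread out,
# so its bandwidth comes out twice as large.
toy = [(float(i), 2.0 * i) for i in range(64)]
h1, h2 = normal_reference_bandwidths(toy)
print(h1, h2)
```

Because the rule is derived for uncorrelated, normally distributed data, it is best read as a starting value; strongly multimodal data will typically call for less smoothing.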
Displaying bivariate density estimates can be accomplished easily using contour
plots or via 3-D perspective or wireframe plots. An example is shown in Fig. . ,
again using the Colorado PRISM data. The left frame shows a contour plot where
each contour represents the points of equal height of the density. The sharp mode
corresponding to the eastern plains is clearly visible, while there appears to be even
further evidence of multimodal structure.
The right plot in Fig. . shows a perspective or wireframe plot of the estimated
density. Note that the density has been rotated in the figure in an attempt to better
display the structure in the data. Clearly additional modes are seen, suggesting more
structure in the data than simply the rough division of the state into the eastern plains
and mountain regions. Of course, densities of this kind (multimodal with differing
scales) pose a special challenge to density estimators with fixed meshes or
bandwidths. Variable bandwidth methods for kernel estimators have been proposed for
bivariate and higher-dimensional data; see Terrell and Scott ( ) and Sain ( ).
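
To make the idea of a variable bandwidth concrete, the sketch below shows one common form, the sample-point estimator, in which every observation carries its own pair of bandwidths. It is an illustration under stated assumptions and does not reproduce the specific methods of the cited papers: the Gaussian kernel, the name sample_point_kde, and the way the per-point bandwidths are supplied by the caller are all choices made for this example.

```python
import math


def sample_point_kde(x, y, data, bandwidths):
    """Sample-point variable-bandwidth estimate at (x, y).

    data is a list of (x_i, y_i) pairs and bandwidths a list of (h1_i, h2_i)
    pairs of the same length, one per observation. How those bandwidths are
    chosen is left to the caller (an assumption of this sketch).
    """
    n = len(data)
    total = 0.0
    for (xi, yi), (h1, h2) in zip(data, bandwidths):
        u = (x - xi) / h1
        v = (y - yi) / h2
        # Gaussian product kernel scaled by this observation's own bandwidths.
        total += math.exp(-0.5 * (u * u + v * v)) / (2.0 * math.pi * h1 * h2)
    return total / n


# Hypothetical example: a tight bandwidth in a dense region and a wide
# bandwidth in a sparse region.
obs = [(0.0, 0.0), (10.0, 10.0)]
hs = [(0.5, 0.5), (2.0, 2.0)]
print(sample_point_kde(0.0, 0.0, obs, hs), sample_point_kde(10.0, 10.0, obs, hs))
```

Small bandwidths near sharp modes and larger ones in flatter regions address the differing scales that a single fixed bandwidth cannot accommodate.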
Whether to view such plots using 3-D perspective plots or 2-D contour plots is
often a matter of personal taste. One view is that the perspective plot is more useful
for obtaining an overview of the structure, while a contour plot is more useful for
obtaining precise information such as the location of a mode.
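
Both kinds of display start from the same ingredient: the density estimate evaluated on a regular grid. The sketch below builds such a grid with a Gaussian product kernel and reads off the grid point of greatest height as an approximate mode location. The function names kde_grid and grid_mode, the kernel choice, and the toy data are assumptions made for illustration; the resulting grid is the kind of array that contour or perspective plotting routines generally take as input.

```python
import math


def kde_grid(data, h1, h2, xs, ys):
    """Evaluate a Gaussian product-kernel estimate on a grid.

    Returns grid with grid[j][i] equal to the estimate at (xs[i], ys[j]).
    """
    n = len(data)
    norm = 1.0 / (n * h1 * h2 * 2.0 * math.pi)
    grid = []
    for y in ys:
        row = []
        for x in xs:
            s = 0.0
            for xi, yi in data:
                s += math.exp(-0.5 * (((x - xi) / h1) ** 2 + ((y - yi) / h2) ** 2))
            row.append(norm * s)
        grid.append(row)
    return grid


def grid_mode(grid, xs, ys):
    """Return (x, y, height) for the grid point with the largest estimate."""
    height, x, y = max(
        (grid[j][i], xs[i], ys[j])
        for j in range(len(ys))
        for i in range(len(xs))
    )
    return x, y, height


# Hypothetical toy data with a heavy cluster at the origin and one outlier.
pts = [(0.0, 0.0), (0.0, 0.0), (0.0, 0.0), (5.0, 5.0)]
axis = [float(v) for v in range(-1, 7)]
print(grid_mode(kde_grid(pts, 1.0, 1.0, axis, axis), axis, axis))
```

The located mode is only as precise as the grid spacing, so a finer grid near the peak gives a sharper answer at a higher computational cost.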