9.1 Introduction
Graphical displays are often constructed to place principal focus on the individual
observations in a dataset, and this is particularly helpful in identifying both the
typical positions of datapoints and unusual or influential cases. However, in many
investigations, principal interest lies in identifying the nature of underlying trends
and relationships between variables, and so it is often helpful to enhance graphical
displays in ways which give deeper insight into these features. This can be very
beneficial both for small datasets, where variation can obscure underlying patterns,
and large datasets, where the volume of data is so large that effective representation
inevitably involves suitable summaries.
These issues are particularly prominent in a regression setting, where it is the
nature of the relationships between explanatory variables and the mean value of a
response which is the focus of attention. Nonparametric smoothing techniques are
extremely useful in this context as they provide an estimate of the underlying
relationship without placing restrictions on the shape of the regression function,
apart from an assumption of smoothness.
This is illustrated in Fig. ., where the left-hand panel displays a scatterplot of
data collected by the Scottish Environment Protection Agency on the level of
dissolved oxygen (DO) close to the start of the Clyde estuary. Data from a substantial
section of the River Clyde are analysed in detail by McMullan et al. ( ), who give the
background details. Water samples have been taken at irregular intervals over a long
period. The top left-hand panel plots the data against time in years. The large amount
of variation in the plot against year makes it difficult to identify whether any
underlying trend is present. The top right-hand panel adds a smooth curve to the plot,
estimating the mean value of the response as a function of year. Some indication of
improvement in DO emerges, with the additional suggestion that this improvement
is largely restricted to the earlier years. The smooth curve therefore provides a
significant enhancement of the display by drawing attention to features of some
potential importance which are not immediately obvious from a plot of the raw data.
However, these features require further investigation to separate real evidence of
change from the effects of sampling variation.
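As a rough illustration of how a smooth curve of this kind can be computed, the
sketch below implements a local linear kernel smoother in plain Python. This is one
common nonparametric smoother and is not necessarily the estimator used to draw the
figure. The function name local_linear_smooth, the Gaussian kernel, the smoothing
parameter h and the simulated data are all assumptions made for the example; no
Clyde measurements are used.

```python
import math
import random


def local_linear_smooth(x, y, grid, h):
    """Local linear estimate of the mean of y given x at each grid point.

    h is the standard deviation of the Gaussian kernel; larger values of h
    give a smoother curve. (Hypothetical helper, for illustration only.)
    """
    fitted = []
    for g in grid:
        # Gaussian kernel weights centred on the evaluation point g
        w = [math.exp(-0.5 * ((xi - g) / h) ** 2) for xi in x]
        s0 = sum(w)
        s1 = sum(wi * (xi - g) for wi, xi in zip(w, x))
        s2 = sum(wi * (xi - g) ** 2 for wi, xi in zip(w, x))
        t0 = sum(wi * yi for wi, yi in zip(w, y))
        t1 = sum(wi * (xi - g) * yi for wi, xi, yi in zip(w, x, y))
        # Intercept of the weighted least-squares line fitted around g
        fitted.append((s2 * t0 - s1 * t1) / (s0 * s2 - s1 ** 2))
    return fitted


# Simulated data only: a smooth trend observed with noise at irregular x values
random.seed(1)
x = [random.uniform(0, 10) for _ in range(200)]
y = [math.sin(xi) + random.gauss(0, 0.5) for xi in x]
grid = [0.5 * i for i in range(21)]
curve = local_linear_smooth(x, y, grid, h=0.5)
```

Plotting curve against grid on top of the scatterplot of y against x gives a display
of the kind described above. The choice of h controls how much the curve follows
local variation, so a curve like this should be read alongside some assessment of
sampling variability rather than taken at face value.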
In exploring the effect of an individual variable, it is also necessary to consider
the simultaneous effects of others. The lower left-hand panel shows the data plotted
against day of the year. Water samples are not taken every day but, when the samples
are plotted by day of the year across the entire time period, a very clear relationship
is evident. This seasonal effect is a periodic one and so this should be reflected in
an appropriate estimate. The smooth curve added to the lower right panel has this
periodic property. It also suggests that a simple trigonometric shape may well be
adequate to describe the seasonal effect. Once a suitable model for this variable has
been constructed, it will be advisable to reexamine the relationship between DO and
year, adjusted for the seasonal effect.
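One simple way to build the periodic property into a smooth estimate is to let the
kernel weights depend on day of the year only through a cosine of the difference
between days. The sketch below does this with a kernel-weighted mean. The function
name periodic_smooth, the von Mises-type kernel, the concentration parameter kappa,
the period of 365 days and the simulated data are assumptions made for illustration,
not details taken from the analysis of the Clyde data.

```python
import math
import random


def periodic_smooth(day, y, grid, kappa, period=365.0):
    """Kernel-weighted mean of y at each grid day, using a periodic kernel.

    Weights depend on day only through cos(2*pi*(day - g)/period), so the
    estimates at g and at g + period coincide. Larger kappa means less
    smoothing. (Hypothetical helper, for illustration only.)
    """
    fitted = []
    for g in grid:
        # Subtracting 1 keeps every exponent <= 0; the weights are only
        # rescaled by a constant, which cancels in the ratio below.
        w = [math.exp(kappa * (math.cos(2 * math.pi * (d - g) / period) - 1.0))
             for d in day]
        fitted.append(sum(wi * yi for wi, yi in zip(w, y)) / sum(w))
    return fitted


# Simulated data only: a single-harmonic seasonal pattern plus noise
random.seed(2)
day = [random.uniform(0, 365) for _ in range(300)]
y = [8 + 2 * math.cos(2 * math.pi * d / 365) + random.gauss(0, 0.5) for d in day]
grid = [0.0, 91.0, 182.0, 365.0]
curve = periodic_smooth(day, y, grid, kappa=20.0)
```

Because day 0 and day 365 receive the same weights, the fitted curve joins up across
the end of the year, which an ordinary smoother applied to day of the year would not
guarantee.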
The aim of this chapter is to discuss the potential benefits of enhancing graphical
displays in this manner, and to illustrate the insights which this can bring to a vari-