6.3 Scatterplot Matrices: splom and xysplom
A scatterplot matrix (splom) is a trellis display in which the panels are defined by
a Cartesian product of variables. In the standard scatterplot matrix constructed by
splom, the same set of variables defines both the rows and columns of the matrix.
More generally, what we term an xysplom uses different sets of variables to define
the rows and the columns of the matrix. We strongly recommend the use of sploms and
xysploms, sometimes conditioned on values of relevant categorical variables, as
initial steps in analyzing a set of data. Examples are given in Sects. . . - . . .
An xysplom, produced with our function xysplom (Heiberger and Holland, ), displays
a rectangular subset, often an off-diagonal block, of a scatterplot matrix. It
involves a Cartesian product of the form [variables] × [variables], where the two
sets of variables contain no common elements. A large splom may be legibly presented
as a succession of smaller sploms (diagonal blocks of the large splom) and xysploms
(off-diagonal blocks of the large splom).
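The block decomposition described above can be sketched in a few lines. This is a minimal illustration in Python, not the authors' xysplom function: the helper names panel_grid and split_into_blocks, the variable names v1 to v6, and the choice of a fixed group size are all assumptions made for the example.

```python
from itertools import product


def panel_grid(row_vars, col_vars):
    """Return the panels of one block as (row variable, column variable) pairs."""
    return list(product(row_vars, col_vars))


def split_into_blocks(variables, group_size):
    """Split the variables of a large splom into consecutive groups.

    Each group paired with itself is a diagonal block (a smaller splom).
    Each pair of distinct groups is an off-diagonal block (an xysplom).
    Only pairs with i < j are listed, because the block for (j, i) is the
    transpose of the block for (i, j).
    """
    groups = [variables[i:i + group_size]
              for i in range(0, len(variables), group_size)]
    sploms = [(g, g) for g in groups]
    xysploms = [(groups[i], groups[j])
                for i in range(len(groups))
                for j in range(i + 1, len(groups))]
    return sploms, xysploms


# Hypothetical variable names, used only to show the layout.
variables = ["v1", "v2", "v3", "v4", "v5", "v6"]
sploms, xysploms = split_into_blocks(variables, group_size=3)

for rows, cols in xysploms:
    # The two variable sets of an xysplom share no elements.
    assert not set(rows) & set(cols)
    print(rows, "x", cols, "->", len(panel_grid(rows, cols)), "panels")
```

With six variables in groups of three, the large 6 x 6 splom is covered by two 3 x 3 sploms and one 3 x 3 xysplom (plus its transpose).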
An example where xysploms are useful in their own right is when examining a set
of potential response variables against members of a set of potential explanatory
variables.
We use an extension

    u+v ~ w+x+y+z | a*b        ( . )

of the syntax of the standard model formula to define the variables of the xysplom
function. The rows and columns of the xysplom are defined by crossing the set of
variables on the left-hand side of the formula (the rows) with the set of variables
on the right-hand side of the formula (the columns). The expanded xysplom generated
with Eq. ( . ) will contain, for each combination of the elements in a and b, an
xysplom having two rows and four columns.
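To make the reading of the formula concrete, the following Python sketch splits it into row, column, and conditioning variables and enumerates the resulting panels. It is a toy parser written for this illustration: the function name parse_xysplom_formula is hypothetical, it handles only "+" on either side of "~" and "*" after "|", and it makes no claim about how the xysplom function itself processes formulas.

```python
from itertools import product


def parse_xysplom_formula(formula):
    """Split a formula such as 'u+v~w+x+y+z|a*b' into its three variable sets.

    Simplifying assumptions: terms are joined only by '+' on both sides
    of '~', and conditioning variables only by '*' after '|'.
    """
    model, _, conditioning = formula.partition("|")
    lhs, tilde, rhs = model.partition("~")
    if not tilde:
        raise ValueError("formula must contain '~'")
    row_vars = [t.strip() for t in lhs.split("+") if t.strip()]
    col_vars = [t.strip() for t in rhs.split("+") if t.strip()]
    cond_vars = [t.strip() for t in conditioning.split("*") if t.strip()]
    return row_vars, col_vars, cond_vars


rows, cols, conds = parse_xysplom_formula("u+v~w+x+y+z|a*b")
panels = list(product(rows, cols))

print(rows)         # ['u', 'v']            -> two rows
print(cols)         # ['w', 'x', 'y', 'z']  -> four columns
print(conds)        # ['a', 'b']            -> conditioning variables
print(len(panels))  # 8 panels per combination of a and b
```

Each (row, column) pair is one scatterplot, and the whole 2 x 4 grid is repeated once for every combination of the conditioning variables.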
6.3.1 Example - Life Expectancy
For each of the largest countries in the world (according to population figures),
data are given for a country's life expectancy at birth categorized by gender,
number of people per television set, and number of people per physician. This is
a subset of the full data set contained in a study cited by Rossman ( ) that sought
a short list of variables that could accurately predict life expectancy:
life.exp: Life expectancy at birth
ppl.per.tv: Number of people per television set
ppl.per.phys: Number of people per physician
Figure . is a scatterplot matrix for a final linear model for these data. The variables
ppl.per.tv and ppl.per.phys were log-transformed to correct for positive
skewness in the original data. This figure demonstrates that the response life.exp
is moderately negatively correlated with both explanatory variables, and the two
explanatory variables are positively associated.
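The two numerical ideas in this paragraph, skewness before and after a log transform and correlation between pairs of variables, can be checked with the short Python sketch below. The data set itself is not reproduced here, so the block defines only helper functions. The function names are hypothetical, and the natural logarithm is an assumption (the base of the logarithm changes neither the skewness nor the correlation).

```python
import math


def log_transform(values):
    """Natural-log transform; all values must be positive."""
    return [math.log(v) for v in values]


def skewness(values):
    """Moment coefficient of skewness: m3 / m2**1.5 (positive = long right tail)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)
```

With the three columns loaded as lists, one would compare skewness(ppl_per_tv) with skewness(log_transform(ppl_per_tv)), and compute pearson(life_exp, log_transform(ppl_per_tv)) and the analogous values for ppl.per.phys, to see the pattern the figure shows.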