This method is especially useful during your data exploration phases. You
might have a dataset in front of you but have no clue where to start or what
it's about. If you don't know what the data is about, your readers won't
either.
To tell a complete story, you have to understand your data. The more you
know about your data, the better the story you can tell.
The scatterplot matrix reads the way you'd expect. It's usually a square grid
with all variables on both the vertical and horizontal axes. Each column
represents a variable on the horizontal axis, and each row represents a variable
on the vertical axis. This provides all possible pairs, and the diagonal is
left for labels because there's no sense in comparing a variable to itself.
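To make the grid layout concrete, here is a minimal sketch that counts the panels in such a matrix. It uses R's built-in USArrests data frame (four numeric columns) as a stand-in, which is an assumption for illustration and not the crime data discussed in this chapter; the variable names `grid`, `plots`, `n_panels`, and `n_unique` are likewise made up for the example.

```r
# Stand-in data: R's built-in USArrests, not the crime data from the text.
vars <- names(USArrests)
p <- length(vars)

# Every (row, column) cell of the p-by-p grid.
grid <- expand.grid(row_var = vars, col_var = vars, stringsAsFactors = FALSE)

# Diagonal cells pair a variable with itself, so they hold labels, not plots.
plots <- grid[grid$row_var != grid$col_var, ]

n_panels <- nrow(plots)           # p * (p - 1) scatterplots are drawn
n_unique <- ncol(combn(vars, 2))  # p * (p - 1) / 2 distinct pairs
```

Each distinct pair shows up twice, mirrored across the diagonal with its axes swapped, which is why there are twice as many panels as pairs.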
CREATE A SCATTERPLOT MATRIX
Now come back to your crime data. You have seven variables, or rates for
crime types, but in the previous example, you compared only two: murder
and burglary. With a scatterplot matrix, you can compare all the crime
types. Figure 6-9 shows what you're making.
As you might expect, there are a lot of positive correlations. The
correlation between burglary and aggravated assault, for example, seems to be
relatively high: as the former goes up, the latter also tends to increase,
and vice versa. The relationship between murder and larceny theft, however,
is not so clear. You shouldn't make any assumptions, but it should be
easy to see how the scatterplot matrix can be useful. At first glance, it can
look confusing with all the lines and plots, but read from left to right and
top to bottom, and you can take away a lot of information.
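If you want a rough numeric check on what the panels suggest, one option is a correlation matrix, which has the same square layout as the scatterplot matrix. This is only a sketch: it again uses the built-in USArrests data frame as a stand-in for the crime data, and `r_matrix` is a name chosen for the example.

```r
# Stand-in data: R's built-in USArrests, not the crime data from the text.
# cor() returns Pearson correlations for every pair of numeric columns.
r_matrix <- cor(USArrests)

# Rounded for easier reading; rows and columns line up with the plot grid.
print(round(r_matrix, 2))

# A single pair can be looked up by column name.
murder_vs_assault <- r_matrix["Murder", "Assault"]
```

A coefficient summarizes only linear association, so it complements the plots but does not replace looking at them.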
Luckily, R makes creating a scatterplot matrix as easy as making a
single scatterplot, although the result is not as robust. Again, use
the plot() function, but instead of passing two columns, pass the whole
data frame, minus the first column because that holds only the state names.
plot(crime2[,2:9])
This gives you a matrix as shown in Figure 6-10, which is almost what you
want. It's still missing fitted curves, though, which can help you see
relationships a bit better.
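One way to get fitted curves is to call pairs() directly and hand it a panel function. The sketch below uses panel.smooth(), which ships with base R and draws the points plus a LOWESS curve in each panel. It runs on the built-in USArrests data frame as a stand-in; applying the same call to the crime2 columns used above is an assumption about how the example would carry over, not something shown in the text so far.

```r
# Stand-in data: R's built-in USArrests, not the crime data from the text.
# Keep only numeric columns, since each panel is a scatterplot.
numeric_cols <- USArrests[, sapply(USArrests, is.numeric)]

# panel.smooth() plots the points and overlays a LOWESS fit in every panel.
pairs(numeric_cols, panel = panel.smooth)

# Assumed equivalent for the chapter's data frame (not run here):
# pairs(crime2[,2:9], panel = panel.smooth)
```

Calling plot() on a data frame with more than two columns produces the same kind of matrix, so switching to pairs() mainly makes the panel argument explicit.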