1.16 Creating a Scatter Plot
Problem
You have paired observations: (x1, y1), (x2, y2), ..., (xn, yn). You want to create a scatter
plot of the pairs.
Solution
If your data is held in two parallel vectors, x and y, use them as arguments of plot:
> plot(x, y)
If your data is held in a (two-column) data frame, plot the data frame:
> plot(dfrm)
Discussion
A scatter plot is usually my first attack on a new dataset. It's a quick way to see the
relationship, if any, between x and y. Creating the scatter plot is easy:
> plot(x, y)
The plot function does not return anything useful. Rather, its purpose is to draw a plot of the
(x, y) pairs in the graphics window.
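As a minimal sketch of the vector form, the following uses made-up data. The vectors x and y, the seed, and the name result are illustrative assumptions, not part of the recipe.

```r
# Illustrative data (assumed for this sketch): y depends loosely on x
set.seed(42)
x <- 1:20
y <- 2 * x + rnorm(20, sd = 3)

# plot() is called for its side effect: drawing the points
result <- plot(x, y)

# The value that comes back is NULL, so there is nothing useful to store
is.null(result)
```

In practice you would simply call plot(x, y) without assigning the result.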
Life is even easier if your data is captured in a two-column data frame. If you plot a
two-column data frame, the function assumes you want a scatter plot created from
those two columns. The scatter plot shown in Figure 1-3 was created by one call to plot:
> plot(cars)
The cars dataset contains two columns: speed and dist. The first column is speed, so
that becomes the x-axis, and dist becomes the y-axis.
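To make the column-to-axis mapping concrete, here is a small sketch using the built-in cars dataset. Passing the columns explicitly should draw the same points as plotting the data frame; only the default axis labels differ.

```r
# cars ships with R in the datasets package
names(cars)   # first column is speed, second is dist

# Data frame form: first column on the x-axis, second on the y-axis
plot(cars)

# Equivalent vector form: same points, but the default axis labels
# become the expressions cars$speed and cars$dist
plot(cars$speed, cars$dist)
```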
If your data frame contains more than two columns, you will get a grid of scatter plots,
one for each pair of columns, which might or might not be useful.
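A short sketch of the multi-column case, using three columns picked from the built-in mtcars dataset. The choice of dataset and columns is an assumption made only for illustration.

```r
# Three numeric columns from the built-in mtcars dataset
dfrm <- mtcars[, c("mpg", "wt", "hp")]

# Draws a grid of scatter plots, one panel per pair of columns
plot(dfrm)

# To look at a single pair instead, select two columns first
plot(dfrm[, c("wt", "mpg")])
```

Selecting the two columns of interest before plotting is usually the simplest way to get back to a single scatter plot.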
To get a scatter plot, your data must be numeric. Recall that plot is a polymorphic
function, so, if the arguments are nonnumeric, it will create some other type of plot.
See Recipe 1.18, for example, which creates box plots from factors.
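The effect of argument type can be seen in a brief sketch. The mtcars dataset and the variable name cyl are assumptions for illustration. The same plot function draws a scatter plot for two numeric vectors and box plots when the first argument is a factor.

```r
# Numeric x and numeric y: a scatter plot
plot(mtcars$wt, mtcars$mpg)

# Factor x and numeric y: plot() draws one box plot per factor level
cyl <- factor(mtcars$cyl)
plot(cyl, mtcars$mpg)
```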
See Also
See the help page for plot to learn more about adding a title, subtitle, and labels.
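As a starting point, the following sketch adds a title, subtitle, and axis labels to the cars plot using the standard main, sub, xlab, and ylab arguments. The label text itself is only an example.

```r
# main = title, sub = subtitle, xlab / ylab = axis labels
plot(cars,
     main = "Stopping Distance vs. Speed",
     sub  = "Source: built-in cars dataset",
     xlab = "Speed (mph)",
     ylab = "Stopping distance (ft)")
```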