The Recipes - Getting Started with R

1.16 Creating a Scatter Plot

Problem

You have paired observations: (x_1, y_1), (x_2, y_2), ..., (x_n, y_n). You want to create a scatter plot of the pairs.

Solution

If your data is held in two parallel vectors, x and y, use them as arguments of plot:

> plot(x, y)

If your data is held in a (two-column) data frame, plot the data frame:

> plot(dfrm)
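To try both forms side by side, here is a minimal sketch. The values in x and y are simulated purely for illustration (they are not from any real dataset), and dfrm is simply those two vectors bound into a data frame.

```r
# Hypothetical sample data: made-up values, used only for illustration
set.seed(42)
x <- rnorm(50)
y <- 2 * x + rnorm(50, sd = 0.5)

# Form 1: two parallel vectors
plot(x, y)

# Form 2: the same data held in a two-column data frame
dfrm <- data.frame(x = x, y = y)
plot(dfrm)
```

Both calls should draw the same points, because the first column of dfrm supplies the x-axis and the second supplies the y-axis.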

Discussion

A scatter plot is usually my first attack on a new dataset. It's a quick way to see the relationship, if any, between x and y. Creating the scatter plot is easy:

> plot(x, y)

The plot function is called for its side effect, not its return value (it returns NULL, invisibly). Its purpose is to draw a plot of the (x, y) pairs in the graphics window.
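If you are curious what plot hands back, you can capture the result and inspect it. This is a small sketch using made-up vectors. For the default scatter plot method the captured value is NULL, which confirms that the drawing itself is the whole point of the call.

```r
# Made-up data for illustration only
x <- c(1, 2, 3, 4, 5)
y <- c(2.1, 3.9, 6.2, 7.8, 10.1)

# Capture whatever plot returns; the drawing still happens as a side effect
result <- plot(x, y)

# Nothing useful comes back
is.null(result)
```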

Life is even easier if your data is captured in a two-column data frame. If you plot a two-column data frame, the function assumes you want a scatter plot created from those two columns. The scatter plot shown in Figure 1-3 was created by one call to plot:

> plot(cars)

The cars dataset contains two columns: speed and dist. The first column is speed, so that becomes the x-axis, and dist becomes the y-axis.
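You can confirm the column order before plotting, and you can get the same picture by passing the columns explicitly. The sketch below uses the built-in cars dataset. The xlab and ylab arguments are added here only so the axis labels match the column names.

```r
# Inspect the column order: the first column goes on the x-axis
names(cars)

# One call, relying on the column order
plot(cars)

# The equivalent explicit form, naming each axis yourself
plot(cars$speed, cars$dist, xlab = "speed", ylab = "dist")
```

The explicit form is handy when the columns are in the wrong order for the plot you want, since you can simply swap the arguments.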

If your data frame contains more than two columns, you will get a matrix of scatter plots, one for each pair of columns, which might or might not be useful.
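To see the multi-column behavior, and one way around it, here is a sketch using the four numeric columns of the built-in iris dataset. The choice of iris and of the two columns picked out at the end is mine, purely for illustration.

```r
# Four numeric columns: plot draws a grid of pairwise scatter plots
measurements <- iris[, 1:4]
plot(measurements)

# Select just the two columns of interest to get a single scatter plot
two_cols <- iris[, c("Sepal.Length", "Petal.Length")]
plot(two_cols)
```

Subsetting down to two columns first keeps the one-call convenience while avoiding the grid.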

To get a scatter plot, your data must be numeric. Recall that plot is a polymorphic function, so if the arguments are nonnumeric, it will create some other type of plot. See Recipe 1.18, for example, which creates box plots from factors.
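A quick check with is.numeric tells you in advance whether you will get a scatter plot. The sketch below uses the built-in cars and iris datasets, chosen here only as convenient examples. Passing a factor as the first argument produces box plots instead of points.

```r
# Every column of cars is numeric, so plot(cars) gives a scatter plot
sapply(cars, is.numeric)

# Species is a factor, not numeric
species <- iris$Species
is.numeric(species)

# With a factor on the x side, plot draws one box plot per level
plot(species, iris$Sepal.Length)
```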