Figure 3.7 Anscombe's quartet visualized as scatterplots
The R code for generating Figure 3.7 is shown next. It requires the R package ggplot2 [11], which can be installed by running the command install.packages("ggplot2"). The anscombe dataset used for the plot is included in the standard R distribution. Enter data() for a list of the available datasets, including those in the R base distribution. Enter data(DatasetName) to make a dataset available in the current workspace.
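As a small illustration of the two data() forms described above, the following sketch checks that anscombe is listed among the datasets of the datasets package and then loads it. Loading into a separate environment (the variable names available and env are chosen here for illustration only) is an optional variation that keeps the workspace untouched; plain data(anscombe) works as well.

```r
# List the dataset names shipped with the 'datasets' package
available <- data(package = "datasets")$results[, "Item"]
has_anscombe <- "anscombe" %in% available

# Load anscombe into a separate environment instead of the workspace
# (env is an illustrative name, not part of the original listing)
env <- new.env()
data("anscombe", package = "datasets", envir = env)

# The dataset has 11 observations and 8 attributes
dim(env$anscombe)
```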
In the code that follows, the variable levels is created using the gl() function, which generates a factor with four levels (1, 2, 3, and 4), each repeated 11 times. The variable mydata is created using the with(data, expression) function, which evaluates an expression in an environment constructed from data. In this example, the data is the anscombe dataset, which includes eight attributes: x1, x2, x3, x4, y1, y2, y3, and y4. The expression part of the code creates a data frame from the anscombe dataset that includes only three attributes: x, y, and the group each data point belongs to (mygroup).
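Before the full listing, the behavior of gl() and with() can be seen on a toy example. This is a minimal sketch on a hypothetical four-column data frame (toy, with columns a1, a2, b1, and b2, all invented here), not the anscombe code itself; it stacks two column pairs into one long data frame in the way the paragraph above describes.

```r
# gl(n, k) builds a factor with n levels, each repeated k times
g <- gl(2, 3)   # values 1 1 1 2 2 2, levels "1" and "2"

# Hypothetical wide data frame: two (a, b) column pairs
toy <- data.frame(a1 = 1:3, a2 = 4:6,
                  b1 = c(10, 20, 30), b2 = c(40, 50, 60))

# with() evaluates the expression using toy's columns as variables,
# so a1, a2, b1, and b2 can be referenced without the toy$ prefix
long <- with(toy, data.frame(a = c(a1, a2),
                             b = c(b1, b2),
                             grp = g))
long
```

Each row of long carries one (a, b) pair plus the group it came from, which is the shape needed to draw one panel per group.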
install.packages("ggplot2")  # not required if package has been installed
data(anscombe)  # load the anscombe dataset into the current workspace
anscombe
x1 x2 x3 x4 y1 y2 y3 y4