Chapter 15. Getting Your Data into Shape
When it comes to making graphs, half the battle occurs before you call any graphing commands.
Before you pass your data to the graphing functions, it must first be read in and given the correct
structure. The data sets provided with R are ready to use, but when dealing with real-world data,
this usually isn't the case: you'll have to clean up and restructure the data before you can
visualize it.
Data sets in R are most often stored in data frames. They're typically used as two-dimensional
data structures, with each row representing one case and each column representing one variable.
Data frames are essentially lists of vectors and factors, all of the same length, where each vector
or factor represents one column.
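As a minimal sketch of that "list of equal-length columns" idea, the snippet below builds a small data frame by hand and inspects it. The name df and its handful of values are a hypothetical stand-in chosen for illustration, not one of the data sets shipped with R.

```r
# A small hand-built data frame (hypothetical stand-in for illustration)
df <- data.frame(
  sex      = factor(c("f", "f", "m", "m")),
  heightIn = c(56.3, 62.3, 62.0, 59.3),
  weightLb = c(85.0, 105.0, 107.5, 87.0)
)

is.list(df)          # TRUE: a data frame is a list underneath
sapply(df, length)   # every column has the same length (4 here)
class(df$sex)        # "factor"
class(df$heightIn)   # "numeric"
dim(df)              # 4 rows (cases) by 3 columns (variables)
```

Because each column is just a list element, a column can be pulled out with df$heightIn or df[["heightIn"]], and the result is an ordinary vector or factor.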
Here's the heightweight data set:
library(gcookbook) # For the data set
heightweight
sex ageYear ageMonth heightIn weightLb
  f   11.92      143     56.3     85.0
  f   12.92      155     62.3    105.0
...
  m   13.92      167     62.0    107.5
  m   12.58      151     59.3     87.0
It consists of five columns, with each row representing one case: a set of information about a
single person. We can get a clearer idea of how it's structured by using the str() function:
str(heightweight)
'data.frame': 236 obs. of 5 variables:
 $ sex     : Factor w/ 2 levels "f","m": 1 1 1 1 1 1 1 1 1 1 ...
 $ ageYear : num  11.9 12.9 12.8 13.4 15.9 ...
 $ ageMonth: int  143 155 153 161 191 171 185 142 160 140 ...
 $ heightIn: num  56.3 62.3 63.3 59 62.5 62.5 59 56.5 62 53.8 ...
 $ weightLb: num  85 105 108 92 112 ...
The first column, sex, is a factor with two levels, "f" and "m", and the other four columns are
vectors of numbers (one of them, ageMonth, is specifically a vector of integers, but for the
purposes here, it behaves the same as any other numeric vector).
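The following is a small sketch of those two points, using a few values typed in by hand rather than the full data set; the standalone vectors sex and ageMonth here are assumptions made for illustration.

```r
# Hand-typed illustration vectors (not read from the data set)
sex      <- factor(c("f", "f", "m"))
ageMonth <- c(143L, 155L)   # the L suffix makes these integers

levels(sex)            # "f" "m"
as.integer(sex)        # 1 1 2: the integer codes that str() displays

class(ageMonth)        # "integer"
is.numeric(ageMonth)   # TRUE: integers count as numeric
class(ageMonth / 12)   # "numeric": arithmetic promotes as needed
mean(ageMonth)         # 149
```

The integer codes are why the str() output shows a run of 1s for sex: each value is stored as an index into the factor's levels.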