Solution
Use the str() function:
str(ToothGrowth)
'data.frame':   60 obs. of  3 variables:
 $ len : num  4.2 11.5 7.3 5.8 6.4 10 11.2 11.2 5.2 7 ...
 $ supp: Factor w/ 2 levels "OJ","VC": 2 2 2 2 2 2 2 2 2 2 ...
 $ dose: num  0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 ...
This tells us that ToothGrowth is a data frame with three columns: len, supp, and dose. The len and dose columns contain numeric values, while supp is a factor with two levels.
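When only the column types are needed, a more compact check than the full str() output can help. The following is a minimal sketch using base R functions; the variable name col_classes is chosen here for illustration and is not part of the original recipe.

```r
# ToothGrowth ships with base R (datasets package), so no extra packages are needed.

# Collect the class of every column as a named character vector
col_classes <- sapply(ToothGrowth, class)
print(col_classes)

# Check a single column directly
is.factor(ToothGrowth$supp)   # is supp stored as a factor?
nlevels(ToothGrowth$supp)     # how many levels does it have?
levels(ToothGrowth$supp)      # what are the level labels?
```

This reports the same type information that str() shows, but without the preview of the values.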
Discussion
The str() function is very useful for finding out more about data structures. One common source of problems is a data frame where one of the columns is a character vector instead of a factor, or vice versa. This can cause puzzling issues with analyses or graphs.
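One way to see why the two types behave differently, and how to repair a column that has the wrong type, is sketched below. It assumes a working copy named tg_chr (a hypothetical name used only for this illustration) and uses base R functions only.

```r
# Work on a copy so the built-in ToothGrowth data set is left untouched.
# tg_chr is a hypothetical name used only for this illustration.
tg_chr <- ToothGrowth
tg_chr$supp <- as.character(tg_chr$supp)

# A factor carries a fixed set of levels; a character vector has none
fct_levels <- levels(ToothGrowth$supp)
chr_levels <- levels(tg_chr$supp)
print(fct_levels)
print(chr_levels)

# summary() tabulates a factor by level, but gives only a generic
# description (length, class, mode) for a character vector
summary(ToothGrowth$supp)
summary(tg_chr$supp)

# Convert the character column back to a factor, stating the levels explicitly
tg_chr$supp <- factor(tg_chr$supp, levels = c("OJ", "VC"))
is.factor(tg_chr$supp)
```

Listing the levels explicitly in factor() fixes their order, rather than relying on the default alphabetical sorting of the values.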
When you print out a data frame the normal way, by just typing the name at the prompt and pressing Enter, factor and character columns appear exactly the same. The difference will be revealed only when you run str() on the data frame, or print out the column by itself:
tg <- ToothGrowth
tg$supp <- as.character(tg$supp)
str(tg)
'data.frame':   60 obs. of  3 variables:
 $ len : num  4.2 11.5 7.3 5.8 6.4 10 11.2 11.2 5.2 7 ...
 $ supp: chr  "VC" "VC" "VC" "VC" ...
 $ dose: num  0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 ...
# Print out the columns by themselves
# From old data frame (factor)
ToothGrowth$supp
 [1] VC VC VC VC VC VC VC VC VC VC VC VC VC VC VC VC VC VC VC VC VC VC VC VC VC
[26] VC VC VC VC VC OJ OJ OJ OJ OJ OJ OJ OJ OJ OJ OJ OJ OJ OJ OJ OJ OJ OJ OJ OJ
[51] OJ OJ OJ OJ OJ OJ OJ OJ OJ OJ
Levels: OJ VC
# From new data frame (character)
tg$supp