We can further investigate the characteristics of the lizards data frame with
the summary and dim functions.
> summary(lizards)
      Species      Diameter     Height
 Sagrei   :164   narrow:252   high:264
 Distichus:245   wide  :157   low :145
> dim(lizards)
[1] 409 3
From the output of str, summary, and dim, we can see that the data frame
contains 409 observations and 3 variables named Species, Diameter, and
Height. Each observation refers to a single lizard and describes its species
(either sagrei or distichus) and the height and width of the branch it was
perched on when sighted. All the variables are categorical and are therefore
stored as factors; the values they can assume can be listed with the levels
function.
> levels(lizards[, "Species"])
[1] "Sagrei" "Distichus"
> levels(lizards[, "Height"])
[1] "high" "low"
> levels(lizards[, "Diameter"])
[1] "narrow" "wide"
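The same checks can be run on all columns at once. The sketch below is only an illustration: it rebuilds a stand-in for lizards from the cell counts reported in this section, so the helper object counts, the index idx, and the row order of the rebuilt data frame are assumptions rather than part of the original data set.

```r
# Cell counts as reported in the text; expand.grid varies Height fastest,
# then Diameter, then Species, which matches the order of the counts below.
counts <- expand.grid(
  Height   = c("high", "low"),
  Diameter = c("narrow", "wide"),
  Species  = c("Sagrei", "Distichus"),
  stringsAsFactors = FALSE
)
counts$n <- c(86, 32, 35, 11, 73, 61, 70, 41)

# Repeat each combination as many times as it was observed (assumed row order).
idx <- rep(seq_len(nrow(counts)), times = counts$n)
lizards <- data.frame(
  Species  = factor(counts$Species[idx],  levels = c("Sagrei", "Distichus")),
  Diameter = factor(counts$Diameter[idx], levels = c("narrow", "wide")),
  Height   = factor(counts$Height[idx],   levels = c("high", "low"))
)

dim(lizards)                 # number of observations and variables
sapply(lizards, is.factor)   # check that every column is a factor
sapply(lizards, nlevels)     # number of levels of each factor
lapply(lizards, levels)      # levels of all factors in one call
```

Using sapply and lapply avoids repeating the levels call once per column, which becomes more convenient as the number of variables grows.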
An alternative, useful way of displaying these data is a contingency table, which
can be built using the table function.
> table(lizards[, c(3, 2, 1)])
, , Species = Sagrei

      Diameter
Height narrow wide
  high     86   35
  low      32   11

, , Species = Distichus

      Diameter
Height narrow wide
  high     73   70
  low      61   41
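The order in which the two-dimensional contingency tables are listed depends on
the order of the variables in the data frame: table uses the first two variables
as the rows and columns of each two-dimensional table and produces one such
table for each level of the last variable. In this case it is useful to have the
tables split by species, so the columns of lizards were rearranged to put
Species last.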
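The effect of the column order can be checked directly. The following is a minimal sketch, again assuming a stand-in for lizards rebuilt from the counts printed above; the object names counts, by.species and by.height are hypothetical and used only for illustration.

```r
# Stand-in data frame rebuilt from the reported cell counts (an assumption;
# the original row order is unknown).
counts <- expand.grid(
  Height   = c("high", "low"),
  Diameter = c("narrow", "wide"),
  Species  = c("Sagrei", "Distichus")
)
counts$n <- c(86, 32, 35, 11, 73, 61, 70, 41)
lizards <- counts[rep(seq_len(nrow(counts)), times = counts$n),
                  c("Species", "Diameter", "Height")]

# Selecting the columns by name is equivalent to lizards[, c(3, 2, 1)]:
# Height gives the rows, Diameter the columns, Species the split.
by.species <- table(lizards[, c("Height", "Diameter", "Species")])

# With the original column order the tables are split by Height instead.
by.height <- table(lizards)
names(dimnames(by.species))
names(dimnames(by.height))

# ftable prints the three-way table as a single flat two-dimensional table.
ftable(by.species, row.vars = c("Species", "Height"))

# margin.table sums over Species, leaving the Height by Diameter counts.
margin.table(by.species, c(1, 2))
```

Referring to the columns by name rather than by position keeps the call readable and still works if the columns of the data frame are reordered.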
Exploratory data analysis often includes some form of graphical data
visualization, especially when dealing with low-dimensional data sets such as
the one we