We can further investigate the characteristics of the lizards data frame with
the summary and dim functions.
> summary(lizards)
      Species      Diameter    Height
 Sagrei   :164   narrow:252   high:264
 Distichus:245   wide  :157   low :145
> dim(lizards)
[1] 409 3
From the output of str, summary, and dim, we can see that the data frame
contains 409 observations and 3 variables named Species, Diameter, and Height.
Each observation refers to a single lizard and describes its species (either
sagrei or distichus) and the height and width of the branch it was perched on
when sighted. All the variables are categorical and therefore are stored as
factors; the values they can assume can be listed with the levels function.
> levels(lizards[, "Species"])
[1] "Sagrei" "Distichus"
> levels(lizards[, "Height"])
[1] "high" "low"
> levels(lizards[, "Diameter"])
[1] "narrow" "wide"
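The same checks can be run on every column at once. The following is a minimal sketch, not the book's own code: it rebuilds a stand-in data frame (the name lizards_demo and the helper objects are hypothetical) from the cell counts printed in this section, so that it runs without the original data set, and then uses sapply and lapply to confirm that each variable is a factor and to list its levels.

```r
# Stand-in for the lizards data frame, rebuilt from the counts shown in this
# section (assumption: the real data are loaded elsewhere in the book).
counts <- expand.grid(
  Height   = c("high", "low"),
  Diameter = c("narrow", "wide"),
  Species  = c("Sagrei", "Distichus"),
  stringsAsFactors = FALSE
)
# expand.grid varies Height fastest, then Diameter, then Species.
counts$n <- c(86, 32, 35, 11, 73, 61, 70, 41)

# Repeat each combination n times to obtain one row per lizard.
rows <- counts[rep(seq_len(nrow(counts)), counts$n), ]

lizards_demo <- data.frame(
  Species  = factor(rows$Species,  levels = c("Sagrei", "Distichus")),
  Diameter = factor(rows$Diameter, levels = c("narrow", "wide")),
  Height   = factor(rows$Height,   levels = c("high", "low"))
)

# Check that every column is a factor and collect the levels of each one.
all_factors <- sapply(lizards_demo, is.factor)
level_list  <- lapply(lizards_demo, levels)

print(dim(lizards_demo))
print(all_factors)
print(level_list)
```

Applying levels through lapply avoids repeating the call once per column, which becomes more convenient as the number of variables grows.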
An alternative, useful way of displaying these data is a contingency table, which
can be built using the table function.
> table(lizards[, c(3, 2, 1)])
, , Species = Sagrei

      Diameter
Height narrow wide
  high     86   35
  low      32   11

, , Species = Distichus

      Diameter
Height narrow wide
  high     73   70
  low      61   41
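The order in which the two-dimensional contingency tables are listed depends on
the order of the variables in the data frame: the first two variables index the
rows and columns of each table, and one table is printed for each level of the
last variable. In this case it is useful to have the tables split by species, so
the columns of lizards were reordered to put Species last.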
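The effect of variable order can also be seen by naming the variables explicitly. The sketch below is an illustration under stated assumptions rather than the book's code: it starts from the cell counts printed above (the objects counts, tab, flipped and by_species are hypothetical names), builds the three-way table with xtabs, and then permutes its dimensions with aperm to change which variable splits the tables.

```r
# Cell counts copied from the contingency tables printed in this section.
counts <- expand.grid(
  Height   = factor(c("high", "low"), levels = c("high", "low")),
  Diameter = factor(c("narrow", "wide"), levels = c("narrow", "wide")),
  Species  = factor(c("Sagrei", "Distichus"),
                    levels = c("Sagrei", "Distichus"))
)
# expand.grid varies Height fastest, then Diameter, then Species.
counts$n <- c(86, 32, 35, 11, 73, 61, 70, 41)

# The order of the terms in the formula fixes the order of the dimensions:
# Height on the rows, Diameter on the columns, one table per Species.
tab <- xtabs(n ~ Height + Diameter + Species, data = counts)
print(tab)

# Reversing the dimensions makes Height the splitting variable instead.
flipped <- aperm(tab, c(3, 2, 1))
print(flipped)

# Marginal counts per species, matching the output of summary.
by_species <- margin.table(tab, 3)
print(by_species)
```

Using a formula makes the intended layout explicit in the call itself, whereas indexing columns by position relies on remembering the column order of the data frame.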
Exploratory data analysis often includes some form of graphical data visualization,
especially when dealing with low-dimensional data sets such as the one we