Biology Reference
In-Depth Information
Setting the
header
argument to
TRUE
tells
read.table
that the first line of the
file
lizards.txt
contains the variable names. Each observation must be writ-
ten in a single line, and the values assumed by the variables for that observation
correspond to the fields (separated by spaces or tabulations) present in that line.
In both cases, the data is stored in a
data frame
called
lizards
, whose structure
can be examined with the
str
function.
> str(lizards)
'data.frame': 409 obs. of 3 variables:
$ Species : Factor w/ 2 levels "Sagrei","Distichus":
1 ...
$ Diameter: Factor w/ 2 levels "narrow","wide": 1 ...
$ Height : Factor w/ 2 levels "high","low": 2 2 ...
Like most programming languages,
R
defines a large set of
classes
of objects, which
represent and provide an interface to different types of variables. Some of these
classes correspond to various kinds of variables used in statistical modeling:
•
Logical
: indicator variables, e.g., either
TRUE
or
FALSE
•
Integer
: natural numbers, e.g., 1
,
2
,...,
n
∈
N
,
π
,
√
2
•
Numeric
: real numbers, such as 1
.
2
•
Character
: character strings, such as
"a"
,
"b"
,
"c"
•
Factor
: categorical variables, defined over a finite set of
levels
identified by char-
acter strings
•
Ordered
: ordered categorical variables, similar to factors but with an explicit
ordering of the levels, e.g.,
"LOW" < "AVERAGE" < "HIGH"
.
Other classes correspond to more complex data types, such as multidimensional or
heterogeneous data:
•
List
: a collection of arbitrary objects, often belonging to different classes
•
Vector
: a mathematical vector of elements belonging to the same class (i.e.,
all integers, all factors with the same levels, etc.) with an arbitrary number of
dimensions
•
Matrix
: a matrix (i.e., a 2-dimensional vector) of elements belonging to the same
class
•
Data frame
: a list of objects with the same length but possibly different classes.
It is usually displayed and manipulated in the same way as a matrix.
As we can see from the output of
str
,
read.table
saves the data read from
lizards.txt
in a data frame to allow each variable to be stored as an object
of the appropriate class. The labels of the possible values of each variable, which
are character strings in
lizards.txt
, are automatically used as levels and the
variables converted to factors.
Search WWH ::
Custom Search