Solution
The most common way to read in data is from a comma-separated values (CSV) file:
data <- read.csv("datafile.csv")
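As a minimal, self-contained sketch of the call above, the following writes a small CSV file to a temporary location and reads it back. The file contents and the name tmp are made up for illustration and do not come from a real data set.

```r
# Create a small CSV file in a temporary location (illustrative contents)
tmp <- tempfile(fileext = ".csv")
writeLines(c("x,y",
             "1,10",
             "2,20",
             "3,30"), tmp)

# Read the file; by default the first row supplies the column names
data <- read.csv(tmp)

# Inspect the column names and the number of rows
names(data)
nrow(data)

# Remove the temporary file
unlink(tmp)
```

Using tempfile() keeps the example from touching any real file in the working directory.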
Discussion
Since data files come in many different formats, there are many options for loading them. For example, if the data file does not have headers in the first row:
data <- read.csv("datafile.csv", header=FALSE)
The resulting data frame will have columns named V1, V2, and so on, and you will probably want to rename them manually:
# Manually assign the header names
names(data) <- c("Column1", "Column2", "Column3")
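The renaming step can be tried without a file on disk. The sketch below assumes a small headerless data set passed inline through the text argument of read.csv(); the values and the names csv_text and default_names are hypothetical.

```r
# Headerless CSV content supplied inline (hypothetical values)
csv_text <- "1,a,TRUE
2,b,FALSE"

# With header=FALSE the first line is treated as data, not column names
data <- read.csv(text = csv_text, header = FALSE)

# Keep the automatically generated names for comparison
default_names <- names(data)
default_names

# Manually assign the header names
names(data) <- c("Column1", "Column2", "Column3")
names(data)
```

The vector assigned to names(data) needs one entry per column, in column order.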
You can set the delimiter with sep. If the file is space-delimited, use sep=" ". If it is tab-delimited, use sep="\t", as in:
data <- read.csv("datafile.csv", sep="\t")
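To see the two delimiter settings side by side, here is a short sketch that reads the same made-up table once as tab-delimited and once as space-delimited text. The inline strings and the names tab_data and space_data are assumptions for illustration only.

```r
# Tab-delimited content (illustrative); "\t" in an R string is a tab character
tab_text <- "name\tscore\nA\t1\nB\t2"
tab_data <- read.csv(text = tab_text, sep = "\t")

# Space-delimited content (illustrative)
space_text <- "name score\nA 1\nB 2"
space_data <- read.csv(text = space_text, sep = " ")

# Inspect both results; each has a name column and a score column
str(tab_data)
str(space_data)
```

Only the sep value changes between the two calls; the rest of the read.csv() arguments keep their defaults.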
In versions of R before 4.0.0, strings in the data are treated as factors by default (since R 4.0.0, the default is stringsAsFactors=FALSE). Suppose this is your data file, and you read it in using read.csv():
"First","Last","Sex","Number"
"Currer","Bell","F",2
"Dr.","Seuss","M",49
"","Student",NA,21
With strings treated as factors, the resulting data frame will store First and Last as factors, though it makes more sense in this case to treat them as strings (or characters, in R terminology). To read them as strings, set stringsAsFactors=FALSE. If there are any columns that should be treated as factors, you can then convert them individually:
data <- read.csv("datafile.csv", stringsAsFactors=FALSE)
# Convert to factor
data$Sex <- factor(data$Sex)
str(data)
'data.frame': 3 obs. of 4 variables:
$ First : chr "Currer" "Dr." ""
$ Last : chr "Bell" "Seuss" "Student"