A major source of confusion is that suburbs[[1]] and suburbs[1] look similar but
produce very different results:
suburbs[[1]]
This returns one column.
suburbs[1]
This returns a data frame, and the data frame contains exactly one column. This
is a special case of dfrm[c(n1, n2, ..., nk)]. We don't need the c(...) construct
because there is only one n.
The point here is that “one column” is different from “a data frame that contains one
column.” The first expression returns a column, so it's a vector or a factor. The second
expression returns a data frame, which is different.
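The difference can be checked directly. The following is a minimal sketch that assumes a small stand-in for suburbs with only two columns (city and pop); the real data frame has more columns, and the values here are for illustration only.

```r
# Hypothetical stand-in for suburbs: two columns only, character city names
suburbs <- data.frame(
  city = c("Chicago", "Kenosha", "Aurora"),
  pop  = c(2853114, 90352, 171782),
  stringsAsFactors = FALSE
)

col  <- suburbs[[1]]   # the column itself
dfr1 <- suburbs[1]     # a data frame that holds that one column

is.data.frame(col)     # FALSE
is.character(col)      # TRUE
is.data.frame(dfr1)    # TRUE
ncol(dfr1)             # 1
```

Functions that expect a vector, such as mean or length, behave differently when handed the one-column data frame, so it is worth checking which form you have.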
R lets you use matrix notation to select columns, as shown in the Solution. But an odd
quirk can bite you: you might get a column or you might get a data frame, depending
on how many subscripts you use. In the simple case of one index you get a column, like
this:
> suburbs[,1]
[1] "Chicago" "Kenosha" "Aurora" "Elgin"
[5] "Gary" "Joliet" "Naperville" "Arlington Heights"
[9] "Bolingbrook" "Cicero" "Evanston" "Hammond"
[13] "Palatine" "Schaumburg" "Skokie" "Waukegan"
But using the same matrix-style syntax with multiple indexes returns a data frame:
> suburbs[,c(1,4)]
city pop
1 Chicago 2853114
2 Kenosha 90352
3 Aurora 171782
4 Elgin 94487
5 Gary 102746
6 Joliet 106221
7 Naperville 147779
8 Arlington Heights 76031
9 Bolingbrook 70834
10 Cicero 72616
11 Evanston 74239
12 Hammond 83048
13 Palatine 67232
14 Schaumburg 75386
15 Skokie 63348
16 Waukegan 91452
This creates a problem. Suppose you see this expression in some old R script:
dfrm[,vec]
Does that return a column or a data frame? Well, it depends. If vec contains one value,
you get a column; otherwise, you get a data frame. You cannot tell from the syntax
alone.
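The ambiguity is easy to reproduce. The sketch below assumes a small hypothetical dfrm with three numeric columns named a, b, and c. It also shows one possible way to make the result predictable: the drop argument of the subscript operator for data frames, which, when set to FALSE, keeps the result a data frame even for a single column.

```r
# Hypothetical data frame for illustration only
dfrm <- data.frame(a = 1:3, b = 4:6, c = 7:9)

vec <- 2
one <- dfrm[, vec]                 # one index: a plain vector
is.data.frame(one)                 # FALSE

vec <- c(1, 3)
many <- dfrm[, vec]                # several indexes: a data frame
is.data.frame(many)                # TRUE

vec <- 2
kept <- dfrm[, vec, drop = FALSE]  # drop = FALSE: always a data frame
is.data.frame(kept)                # TRUE
```

With drop = FALSE the expression returns a data frame no matter how many values vec holds, so code that follows it does not need to handle two cases.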
 