The Recipes - Getting Started with R


A major source of confusion is that suburbs[[1]] and suburbs[1] look similar but produce very different results:

suburbs[[1]]

This returns one column.

suburbs[1]

This returns a data frame, and the data frame contains exactly one column. This is a special case of dfrm[c(n1, n2, ..., nk)]. We don't need the c(...) construct because there is only one n.

The point here is that “one column” is different from “a data frame that contains one

column.” The first expression returns a column, so it's a vector or a factor. The second

expression returns a data frame, which is different.
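The difference can be checked directly at the console. The following is a minimal sketch, assuming a small stand-in for the suburbs data frame (three rows and two columns only; the real data frame has more of both):

```r
# Hypothetical stand-in for the suburbs data frame used in this recipe
suburbs <- data.frame(
  city = c("Chicago", "Kenosha", "Aurora"),
  pop  = c(2853114, 90352, 171782),
  stringsAsFactors = FALSE
)

col_only <- suburbs[[1]]   # the column itself
df_one   <- suburbs[1]     # a data frame that holds that one column

is.data.frame(col_only)    # FALSE: it is a plain character vector
is.data.frame(df_one)      # TRUE: still a data frame, with one column

# suburbs[1] is the one-element case of dfrm[c(n1, n2, ..., nk)]
identical(suburbs[1], suburbs[c(1)])   # TRUE
```

Checking with is.data.frame (or class) is a quick way to confirm which of the two forms an expression actually produced.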

R lets you use matrix notation to select columns, as shown in the Solution. But an odd quirk can bite you: you might get a column or you might get a data frame, depending on how many subscripts you use. In the simple case of one index you get a column, like this:

> suburbs[,1]

[1] "Chicago" "Kenosha" "Aurora" "Elgin"

[5] "Gary" "Joliet" "Naperville" "Arlington Heights"

[9] "Bolingbrook" "Cicero" "Evanston" "Hammond"

[13] "Palatine" "Schaumburg" "Skokie" "Waukegan"

But using the same matrix-style syntax with multiple indexes returns a data frame:

> suburbs[,c(1,4)]

                city     pop
1            Chicago 2853114
2            Kenosha   90352
3             Aurora  171782
4              Elgin   94487
5               Gary  102746
6             Joliet  106221
7         Naperville  147779
8  Arlington Heights   76031
9        Bolingbrook   70834
10            Cicero   72616
11          Evanston   74239
12           Hammond   83048
13          Palatine   67232
14        Schaumburg   75386
15            Skokie   63348
16          Waukegan   91452

This creates a problem. Suppose you see this expression in some old R script:

dfrm[,vec]

Does that return a column or a data frame? Well, it depends. If vec contains one value,

you get a column; otherwise, you get a data frame. You cannot tell from the syntax

alone.
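The ambiguity is easy to reproduce. The sketch below assumes a small hypothetical dfrm (not the data from this recipe) and runs the same expression with vec holding one index and then two. It also shows the drop = FALSE argument of base R's data frame subsetting, which this excerpt does not mention, as one possible way to get a data frame back regardless of the length of vec:

```r
# Hypothetical data frame standing in for dfrm
dfrm <- data.frame(
  city  = c("Chicago", "Kenosha", "Aurora"),
  state = c("IL", "WI", "IL"),
  pop   = c(2853114, 90352, 171782),
  stringsAsFactors = FALSE
)

vec <- 1
one <- dfrm[, vec]        # one index: a bare column (a vector)

vec <- c(1, 3)
two <- dfrm[, vec]        # several indexes: a data frame

is.data.frame(one)        # FALSE
is.data.frame(two)        # TRUE

# drop = FALSE keeps the data frame structure even for a single column
vec  <- 1
kept <- dfrm[, vec, drop = FALSE]
is.data.frame(kept)       # TRUE
```

With drop = FALSE the result type no longer depends on how many values vec happens to contain, which makes scripts like the one above easier to reason about.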
