Getting Your Data into Shape - R Graphics

Summarizing Data by Groups

Problem

You want to summarize your data, based on one or more grouping variables.

Solution

Use ddply() from the plyr package with the summarise() function, and specify the operations to perform:

library(MASS) # For the data set
library(plyr)

ddply(cabbages, c("Cult", "Date"), summarise, Weight = mean(HeadWt),
      VitC = mean(VitC))

 Cult Date Weight VitC
  c39  d16   3.18 50.3
  c39  d20   2.80 49.4
  c39  d21   2.74 54.8
  c52  d16   2.26 62.5
  c52  d20   3.11 58.9
  c52  d21   1.47 71.8
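To see what ddply() does for each group, one cell of the result can be recomputed by hand. The following is a minimal sketch, assuming only that MASS is installed; the variable names grp and by_hand are made up for illustration. It subsets the rows for Cult c39 and Date d16 and takes the mean of HeadWt, which should agree with the Weight value in the first row of the output above.

```r
library(MASS)  # for the cabbages data set

# Rows belonging to one Cult/Date combination (names here are illustrative)
grp <- cabbages[cabbages$Cult == "c39" & cabbages$Date == "d16", ]

# The same operation that summarise() applies within each group
by_hand <- mean(grp$HeadWt)
by_hand
```

Repeating this for every combination of Cult and Date, then stacking the results into one data frame, is the work that ddply() automates.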

Discussion

Let's take a closer look at the cabbages data set. It has two factors that can be used as grouping variables: Cult, which has levels c39 and c52, and Date, which has levels d16, d20, and d21. It also has two numeric variables, HeadWt and VitC:

cabbages

 Cult Date HeadWt VitC
  c39  d16    2.5   51
  c39  d16    2.2   55
 ...
  c52  d21    1.5   66
  c52  d21    1.6   72
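The structure described above can be checked directly. This short sketch, assuming MASS is installed, uses base R functions to list the factor levels and to show which columns are numeric.

```r
library(MASS)  # for the cabbages data set

levels(cabbages$Cult)   # levels of the first grouping factor
levels(cabbages$Date)   # levels of the second grouping factor

# TRUE for numeric columns, FALSE for factors
sapply(cabbages, is.numeric)

# Compact overview of column types and the first few values
str(cabbages)
```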

Finding the overall mean of HeadWt is simple. We could just use the mean() function on that column, but for reasons that will soon become clear, we'll use the summarise() function instead:
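A call of this kind might look like the following sketch, which assumes plyr and MASS are installed and reuses the Weight column name from the Solution; overall is an illustrative variable name. With no grouping variables, summarise() returns a one-row data frame rather than a bare number.

```r
library(MASS)  # for the cabbages data set
library(plyr)

# Summarise the whole data frame as a single group
overall <- summarise(cabbages, Weight = mean(HeadWt))
overall

# The value should agree with calling mean() on the column directly
all.equal(overall$Weight, mean(cabbages$HeadWt))
```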
