Summarizing Data with Standard Errors and Confidence Intervals
Problem
You want to summarize your data with the standard error of the mean and/or confidence intervals.
Solution
Getting the standard error of the mean involves two steps: first get the standard deviation and
count for each group, then use those values to calculate the standard error. The standard error for
each group is just the standard deviation divided by the square root of the sample size:
library(MASS)  # For the data set
library(plyr)

ca <- ddply(cabbages, c("Cult", "Date"), summarise,
            Weight = mean(HeadWt, na.rm = TRUE),
            sd = sd(HeadWt, na.rm = TRUE),
            n = sum(!is.na(HeadWt)),
            se = sd / sqrt(n))
ca
 Cult Date Weight        sd  n         se
  c39  d16   3.18 0.9566144 10 0.30250803
  c39  d20   2.80 0.2788867 10 0.08819171
  c39  d21   2.74 0.9834181 10 0.31098410
  c52  d16   2.26 0.4452215 10 0.14079141
  c52  d20   3.11 0.7908505 10 0.25008887
  c52  d21   1.47 0.2110819 10 0.06674995
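The formula can also be checked by hand on a single vector, without plyr. The following is a minimal sketch in base R. The helper name std_err and the sample weights are made up for illustration and are not part of MASS or plyr:

```r
# Hypothetical helper (not from any package): standard error of the mean,
# computed as the standard deviation divided by the square root of the
# number of non-missing values.
std_err <- function(x) {
  x <- x[!is.na(x)]           # drop NAs so that n counts only observed values
  sd(x) / sqrt(length(x))
}

w <- c(2.5, 2.2, 3.1, NA, 4.3)  # made-up weights, including one missing value
std_err(w)
```

Dropping the missing values before counting mirrors the n = sum(!is.na(HeadWt)) step in the ddply() call, so the denominator reflects only the values that contributed to the standard deviation.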
NOTE
In versions of plyr before 1.8, summarise() created all the new columns simultaneously, so you would
have to create the se column separately, after creating the sd and n columns.
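One way the two-step approach described in this note might look is sketched below. It assumes the same cabbages data and column names as in the Solution; the choice to add se with a plain column assignment is an illustration, not necessarily how it was done with older versions of plyr:

```r
library(MASS)  # For the cabbages data set
library(plyr)

# Step 1: compute only the mean, standard deviation, and count per group
ca <- ddply(cabbages, c("Cult", "Date"), summarise,
            Weight = mean(HeadWt, na.rm = TRUE),
            sd = sd(HeadWt, na.rm = TRUE),
            n = sum(!is.na(HeadWt)))

# Step 2: derive the standard error from the columns created in step 1
ca$se <- ca$sd / sqrt(ca$n)
ca
```

This form also works with current versions of plyr, since it does not rely on summarise() seeing columns created earlier in the same call.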