Discussion
Another method is to calculate the standard error inside the call to ddply(). The sd and n columns
from the earlier summary are not available inside that call, so we'll have to recalculate them
there to get se. This does the same thing as the two-step version shown previously:
ddply(cabbages, c("Cult", "Date"), summarise,
      Weight = mean(HeadWt, na.rm = TRUE),
      sd = sd(HeadWt, na.rm = TRUE),
      n = sum(!is.na(HeadWt)),    # count of non-missing values
      se = sd / sqrt(n))
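For readers without plyr installed, the same per-group calculation can be sketched in base R. This is only an illustration under stated assumptions: the toy data frame and the helper name summarise_group are invented here and are not part of the recipe or the cabbages data.

```r
# Toy data, invented for illustration only (not the cabbages data set)
toy <- data.frame(
  group = c("a", "a", "a", "a", "b", "b", "b", "b"),
  value = c(2.0, 4.0, 6.0, NA, 1.0, 3.0, 5.0, 7.0)
)

# Hypothetical helper: mean, sd, non-missing count, and standard error
summarise_group <- function(x) {
  n <- sum(!is.na(x))              # count only the non-missing values
  s <- sd(x, na.rm = TRUE)
  c(mean = mean(x, na.rm = TRUE), sd = s, n = n, se = s / sqrt(n))
}

# Apply the helper to each group and stack the results into one data frame
parts <- lapply(split(toy$value, toy$group), summarise_group)
toy_summary <- data.frame(group = names(parts), do.call(rbind, parts))
toy_summary
```

As in the ddply() call, n counts only non-missing values, so the NA in group a does not inflate the sample size used for the standard error.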
Confidence Intervals
Confidence intervals are calculated using the standard error of the mean and the degrees of
freedom. To calculate a confidence interval, use the qt() function to get the quantile, then
multiply that by the standard error; the result is the distance from the mean to each end of the
interval. The qt() function gives quantiles of the t-distribution when given a probability level
and degrees of freedom. For a 95% confidence interval, use a probability level of .975; for the
bell-shaped t-distribution, this in essence cuts off 2.5% of the area under the curve at either
end. The degrees of freedom equal the sample size minus one.
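As a quick check of this procedure, the following sketch computes a 95% confidence interval by hand for a single sample and compares it with the interval reported by t.test(). The vector x is invented for illustration; only qt(), sd(), and t.test() from base R are used.

```r
# Invented sample of ten measurements (illustration only)
x <- c(2.1, 3.4, 2.8, 3.9, 3.0, 2.5, 3.3, 2.9, 3.6, 2.7)

n    <- sum(!is.na(x))             # sample size
se   <- sd(x) / sqrt(n)            # standard error of the mean
mult <- qt(0.975, df = n - 1)      # 0.975 leaves 2.5% in each tail

# Distance from the mean to each end of the interval
half_width <- mult * se
ci_manual  <- c(mean(x) - half_width, mean(x) + half_width)

# The built-in one-sample t-test reports the same interval
ci_builtin <- as.numeric(t.test(x, conf.level = 0.95)$conf.int)
all.equal(ci_manual, ci_builtin)
```

Agreement between the two results suggests the manual steps match what R does internally for a one-sample interval.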
This will calculate the multiplier for each group. There are six groups and each has the same
number of observations (10), so they will all have the same multiplier:
ciMult <- qt(.975, ca$n - 1)
ciMult
# 2.262157 2.262157 2.262157 2.262157 2.262157 2.262157
Now we can multiply that vector by the standard error to get the 95% confidence interval:
ca$ci <- ca$se * ciMult
Cult Date Weight        sd  n         se        ci
c39  d16    3.18 0.9566144 10 0.30250803 0.6843207
c39  d20    2.80 0.2788867 10 0.08819171 0.1995035
c39  d21    2.74 0.9834181 10 0.31098410 0.7034949
c52  d16    2.26 0.4452215 10 0.14079141 0.3184923
c52  d20    3.11 0.7908505 10 0.25008887 0.5657403
c52  d21    1.47 0.2110819 10 0.06674995 0.1509989
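The ci column holds the distance from each mean to the ends of its interval, so the bounds themselves are Weight minus ci and Weight plus ci. The sketch below rebuilds ca by hand from the values printed in the table (the original object is not available here) and adds lower and upper columns; those two column names are assumptions made for illustration.

```r
# Rebuild the summary from the printed table values
ca <- data.frame(
  Cult   = c("c39", "c39", "c39", "c52", "c52", "c52"),
  Date   = c("d16", "d20", "d21", "d16", "d20", "d21"),
  Weight = c(3.18, 2.80, 2.74, 2.26, 3.11, 1.47),
  n      = rep(10, 6),
  se     = c(0.30250803, 0.08819171, 0.31098410,
             0.14079141, 0.25008887, 0.06674995)
)

# Distance from the mean to each end of the 95% interval
ca$ci <- ca$se * qt(.975, ca$n - 1)

# Hypothetical column names for the interval bounds
ca$lower <- ca$Weight - ca$ci
ca$upper <- ca$Weight + ca$ci

ca[, c("Cult", "Date", "Weight", "lower", "upper")]
```

Bounds in this form are what plotting functions that draw error bars typically expect as their minimum and maximum values.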
We could have done this all in one line, like this:
ca$ci95 <- ca$se * qt(.975, ca$n - 1)
For a 99% confidence interval, use .995 .
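The probability level passed to qt() follows from the confidence level: half of the excluded area goes in each tail, so the level is 1 - (1 - confidence) / 2. A minimal sketch, assuming a hypothetical helper named ci_multiplier that is not part of the recipe:

```r
# Hypothetical helper: t multiplier for a two-sided confidence interval
ci_multiplier <- function(level, n) {
  stopifnot(level > 0, level < 1, all(n > 1))
  p <- 1 - (1 - level) / 2         # 0.95 -> 0.975, 0.99 -> 0.995
  qt(p, df = n - 1)                # degrees of freedom: sample size minus one
}

ci_multiplier(0.95, 10)            # same multiplier as ciMult for n = 10
ci_multiplier(0.99, 10)            # larger multiplier, hence a wider interval
```

Because n can be a vector, the helper can be applied directly to a column such as ca$n.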