Graphics Reference
Error bars that represent the standard error of the mean and error bars that represent confidence
intervals serve the same general purpose: to give the viewer an idea of how good the estimate of
the population mean is. The standard error is the standard deviation of the sampling distribution
of the mean. Confidence intervals are easier to interpret. Very roughly, a 95% confidence interval
means that there's a 95% chance that the true population mean is within the interval (actually, it
doesn't mean this at all, but this seemingly simple topic is way too complicated to cover here; if
you want to know more, read up on Bayesian statistics).
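As a rough illustration of the quantities described above, the following base R sketch computes the standard error of the mean and a 95% confidence interval for a single numeric vector. The data values and the variable names (x, x_obs, ci_mult, and so on) are hypothetical and chosen only for this example; the interval is the usual t-based one, which assumes the observations are independent and roughly normally distributed.

```r
# Hypothetical sample data (not from the original text), with one missing value
x <- c(4.1, 5.3, 6.0, 5.5, 4.8, NA, 5.9)

x_obs <- x[!is.na(x)]      # drop missing values before summarizing
n  <- length(x_obs)        # number of non-missing observations
m  <- mean(x_obs)          # sample mean
s  <- sd(x_obs)            # sample standard deviation
se <- s / sqrt(n)          # standard error of the mean

conf.interval <- 0.95
# t multiplier: for a 95% interval, take the 0.975 quantile with n - 1 df
ci_mult <- qt(conf.interval / 2 + 0.5, df = n - 1)
ci <- se * ci_mult         # half-width of the confidence interval

# Collect the results; the interval runs from m - ci to m + ci
result <- c(n = n, mean = m, sd = s, se = se, lower = m - ci, upper = m + ci)
print(result)
```

Error bars based on the standard error would span m - se to m + se, while error bars based on the confidence interval would span m - ci to m + ci, so the latter are wider by the factor ci_mult.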
The summarySE() function below performs all the steps of calculating the standard deviation, count,
standard error, and confidence intervals. It can also handle NAs and missing combinations, with the
na.rm and .drop options. By default, it provides a 95% confidence interval, but this can be set with
the conf.interval argument:
summarySE <- function(data = NULL, measurevar, groupvars = NULL,
                      conf.interval = .95, na.rm = FALSE, .drop = TRUE) {
    require(plyr)

    # New version of length that can handle NAs: if na.rm==T, don't count them
    length2 <- function(x, na.rm = FALSE) {
        if (na.rm) sum(!is.na(x))
        else       length(x)
    }

    # This does the summary
    datac <- ddply(data, groupvars, .drop = .drop,
                   .fun = function(xx, col, na.rm) {
                       c(n    = length2(xx[, col], na.rm = na.rm),
                         mean = mean(xx[, col], na.rm = na.rm),
                         sd   = sd(xx[, col], na.rm = na.rm)
                       )
                   },
                   measurevar,
                   na.rm
    )

    # Rename the "mean" column
    datac <- rename(datac, c("mean" = measurevar))

    datac$se <- datac$sd / sqrt(datac$n)  # Calculate standard error of the mean

    # Confidence interval multiplier for standard error
    # Calculate t-statistic for confidence interval:
    # e.g., if conf.interval is .95, use .975 (above/below), and use
    # df=n-1, or if n==0, use df=0
    ciMult <- qt(conf.interval / 2 + .5, datac$n - 1)