mean(x)
median(x)
sd(x)
var(x)
cor(x, y)
cov(x, y)
Discussion
When I first opened the documentation for R, I began searching for material called
something like “Procedures for Calculating Standard Deviation.” I figured that such an
important topic would likely require a whole chapter.
It's not that complicated.
Standard deviation and other basic statistics are calculated by simple functions.
Ordinarily, the function argument is a vector of numbers, and the function returns the
calculated statistic:
> x <- c(0,1,1,2,3,5,8,13,21,34)
> mean(x)
[1] 8.8
> median(x)
[1] 4
> sd(x)
[1] 11.03328
> var(x)
[1] 121.7333
The sd function calculates the sample standard deviation, and var calculates the sample
variance.
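To make "sample" concrete: both functions divide by n - 1 rather than n. The following is a minimal sketch that recomputes the variance by hand for the same x; the names n and manual_var are illustrative and do not appear in the original text.

```r
x <- c(0, 1, 1, 2, 3, 5, 8, 13, 21, 34)
n <- length(x)

# Sample variance: sum of squared deviations from the mean, divided by n - 1
manual_var <- sum((x - mean(x))^2) / (n - 1)

# Compare the hand-computed values with the built-in functions;
# all.equal allows for floating-point rounding
print(isTRUE(all.equal(manual_var, var(x))))
print(isTRUE(all.equal(sqrt(manual_var), sd(x))))
```

Both comparisons should print TRUE. If you need the population variance instead, one option is to rescale the result by (n - 1) / n.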
The cor and cov functions can calculate the correlation and covariance, respectively,
between two vectors:
> x <- c(0,1,1,2,3,5,8,13,21,34)
> y <- log(x+1)
> cor(x,y)
[1] 0.9068053
> cov(x,y)
[1] 11.49988
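The two results are related: with the default (Pearson) method, the correlation is the covariance scaled by the two standard deviations. A short sketch of that relationship, assuming the same x and y as above (manual_cor is an illustrative name, not from the original):

```r
x <- c(0, 1, 1, 2, 3, 5, 8, 13, 21, 34)
y <- log(x + 1)

# Pearson correlation = covariance / (sd of x * sd of y)
manual_cor <- cov(x, y) / (sd(x) * sd(y))

# Compare with the built-in; all.equal allows for floating-point rounding
print(isTRUE(all.equal(manual_cor, cor(x, y))))
```

This is why cor always falls between -1 and 1, while cov depends on the units of x and y.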
All these functions are picky about values that are not available (NA). Even one NA
value in the vector argument causes any of these functions to return NA, or even halt
altogether with a cryptic error:
> x <- c(0,1,1,2,3,NA)
> mean(x)
[1] NA
> sd(x)
[1] NA
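One common way around this, sketched here on the assumption that simply dropping the missing values is acceptable for your analysis, is the na.rm argument that mean, median, sd, and var accept:

```r
x <- c(0, 1, 1, 2, 3, NA)

# na.rm = TRUE removes NA values before the statistic is computed
m <- mean(x, na.rm = TRUE)
s <- sd(x, na.rm = TRUE)

print(m)
print(s)
```

Here the statistics are computed from the five remaining values. The cor and cov functions handle missing values through a separate use argument (for example, use = "complete.obs"), so check their help page before relying on na.rm with them.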
 