variables, meters and age. This trend can be described by Pearson's
correlation coefficient r, where r = 1 stands for a perfect positive correlation,
i.e., age increases with meters, r = 0 suggests no linear correlation, and
r = -1 indicates a perfect negative correlation. We use the function corrcoef
to compute Pearson's correlation coefficient.
corrcoef(meters,age)
which yields the output
ans =
1.0000 0.9342
0.9342 1.0000
The function corrcoef calculates a matrix of correlation coefficients
for all possible combinations of the two variables. The combinations
(meters, age) and (age, meters) result in r = 0.9342, whereas
(age, age) and (meters, meters) yield r = 1.0000.
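
The layout of this matrix can be checked against the definition of Pearson's r, the covariance of the two variables divided by the product of their standard deviations. The following is a minimal sketch only: the vectors a and b are hypothetical illustrative values, not the meters and age data used in the text, and the variable names are chosen for this example.

```matlab
% Hypothetical illustrative data (NOT the meters/age data from the text)
a = [1 2 3 4 5 6]';
b = [2.1 3.9 6.2 7.8 10.1 12.2]';

% Pearson's r from its definition: sum of products of the deviations
% from the mean, normalized by the root of the summed squared deviations
da = a - mean(a);
db = b - mean(b);
r_manual = sum(da.*db) / sqrt(sum(da.^2) * sum(db.^2));

% corrcoef returns a 2-by-2 matrix; the off-diagonal elements hold r,
% the diagonal elements are the correlations of a and b with themselves
R = corrcoef(a,b);
r_builtin = R(1,2);

% Both values should agree to within rounding error
disp([r_manual r_builtin])
```

Extracting R(1,2) instead of displaying the full matrix is convenient when only the single coefficient between the two variables is needed.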
The value of r = 0.9342 suggests a strong linear relationship between the
two variables age and meters. However, Pearson's correlation coefficient is
highly sensitive to outliers. This can be illustrated by the following example.
Let us generate a normally-distributed cluster of thirty (x,y) data points with
zero mean and standard deviation one. To obtain the same data values on every
run, we reset the random number generator by using the integer 5 as seed.
randn('seed',5);
x = randn(30,1); y = randn(30,1);
plot(x,y,'o'), axis([-1 20 -1 20]);
As expected, the correlation coefficient of these random data is very low.
corrcoef(x,y)
ans =
1.0000 0.1021
0.1021 1.0000
Now we introduce a single outlier to the data set, an exceptionally high
(x,y) value, which is located precisely on the one-to-one line y = x. The
correlation coefficient for the bivariate data set including the outlier
(x,y) = (5,5) is considerably higher than before.
x(31,1) = 5; y(31,1) = 5;
plot(x,y,'o'), axis([-1 20 -1 20]);
corrcoef(x,y)
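
The comparison before and after adding the outlier can also be collected in one script. This is a sketch under stated assumptions: it reuses the seeding call from the text, the names xo, yo, r_without and r_with are introduced here for illustration, and the printed values depend on the MATLAB version and random number generator in use, so none are quoted.

```matlab
randn('seed',5);                 % same legacy seeding call as in the text
x = randn(30,1); y = randn(30,1);

% Correlation coefficient of the random cluster without the outlier
R0 = corrcoef(x,y);
r_without = R0(1,2);

% Append the single outlier (5,5) as a 31st data point and recompute
xo = [x; 5]; yo = [y; 5];
R1 = corrcoef(xo,yo);
r_with = R1(1,2);

fprintf('r without outlier: %.4f\n', r_without);
fprintf('r with outlier:    %.4f\n', r_with);
```

Keeping the original vectors x and y unchanged and storing the extended data in separate variables makes it easy to rerun either computation.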