1.0000 0.9563
0.9563 1.0000
The function corrcoef calculates a matrix of Pearson's correlation coefficients
for all possible combinations of the two variables age and meters. The value
of r = 0.9563 suggests that the two variables age and meters are dependent on
each other.
Pearson's correlation coefficient is, however, highly sensitive to outliers,
as can be illustrated by the following example. Let us generate a cluster of
thirty normally distributed data points with a mean of zero and a standard
deviation of one. To obtain identical data values, we reset the random number
generator by using the integer 10 as seed.
clear
rng(10)
x = randn(30,1); y = randn(30,1);
plot(x,y,'o'), axis([-1 20 -1 20]);
As expected, the correlation coefficient for these random data is very low.
corrcoef(x,y)
ans =
1.0000 0.0302
0.0302 1.0000
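A value close to zero, but not exactly zero, is typical for independent data. As a rough sketch (not part of the original example), the loop below repeats the experiment many times and compares the spread of the resulting coefficients with 1/sqrt(n-1), the sampling standard deviation of r for uncorrelated normally distributed data. The variable names nrep, n, rsim and c, and the choice of 2000 repetitions, are assumptions made here for illustration.

```matlab
rng(10)                          % fixed seed so the run is repeatable
nrep = 2000;                     % assumed number of repetitions
n = 30;                          % sample size used in the text
rsim = zeros(nrep,1);
for i = 1:nrep
    % two independent normally distributed samples of length n
    c = corrcoef(randn(n,1), randn(n,1));
    rsim(i) = c(1,2);            % off-diagonal element is Pearson's r
end
mean(rsim)                       % average of the simulated coefficients
std(rsim)                        % observed spread of r
1/sqrt(n-1)                      % spread expected for uncorrelated data
```

The observed spread should be close to the last value, so a coefficient of a few hundredths is well within what chance alone produces for thirty points.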
Now we introduce a single outlier to the data set in the form of an exceptionally
high (x,y) value, in which x=y. The correlation coefficient for the bivariate
data set including the outlier (x,y)=(5,5) is much higher than before.
x(31,1) = 5; y(31,1) = 5;
plot(x,y,'o'), axis([-1 20 -1 20]);
corrcoef(x,y)
ans =
1.0000 0.5022
0.5022 1.0000
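One way to see why a single point has such an effect is to compute Pearson's r directly from its definition and check how much of the numerator comes from the outlier. The following is a minimal sketch under that assumption and is not taken from the original text. The names x1, y1, dx, dy, cp, r1 and share are hypothetical and chosen so that x and y are left unchanged.

```matlab
rng(10)                              % same seed as in the text
x1 = randn(30,1); y1 = randn(30,1);
x1(31,1) = 5; y1(31,1) = 5;          % outlier (5,5) from the text

% Deviations from the means and their cross-products; the sum of
% the cross-products is the numerator of Pearson's r.
dx = x1 - mean(x1);
dy = y1 - mean(y1);
cp = dx .* dy;

% Pearson's r computed from the definition
r1 = sum(cp) / sqrt(sum(dx.^2) * sum(dy.^2))

% Fraction of the numerator contributed by the outlier alone
share = cp(31) / sum(cp)
```

The value of r1 should agree with the off-diagonal element returned by corrcoef(x1,y1), and share indicates how strongly the thirty-first point dominates the sum of cross-products.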
Increasing the absolute (x,y) values for this outlier results in a dramatic
increase in the correlation coefficient.
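This trend can be sketched with a short loop that places the outlier at increasing distances from the cluster. The example is illustrative only: the names x0, y0, a, rout and c, and the outlier positions 5, 10 and 20, are assumptions made here, and x and y are not modified.

```matlab
rng(10)                          % same seed as in the text
x0 = randn(30,1); y0 = randn(30,1);

a = [5 10 20];                   % assumed outlier positions (a,a)
rout = zeros(size(a));
for i = 1:numel(a)
    % append one outlier at (a(i),a(i)) to the thirty random points
    c = corrcoef([x0; a(i)], [y0; a(i)]);
    rout(i) = c(1,2);            % off-diagonal element is Pearson's r
end
rout                             % one coefficient per outlier position
```

Each entry of rout corresponds to one outlier position, so the values can be read directly against a.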
x(31,1) = 10; y(31,1) = 10;
plot(x,y,'o'), axis([-1 20 -1 20]);
corrcoef(x,y)