Bivariate Statistics - MATLAB Recipes for Earth Sciences


ans =

    1.0000    0.7981
    0.7981    1.0000

and reaches a value close to r = 1 if the outlier has a value of (x,y) = (20,20).

x(31,1) = 20; y(31,1) = 20;

plot(x,y,'o'), axis([-1 20 -1 20]);

corrcoef(x,y)

ans =

    1.0000    0.9403
    0.9403    1.0000
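
The effect of moving the outlier can be sketched in a short loop. This is a minimal illustration that assumes synthetic data (30 normally distributed random points generated after rng(0)), which may differ from the data used above, so the resulting values need not match those quoted in the text. The variable names outlier, xo, and yo are hypothetical.

```matlab
% Hypothetical synthetic data: 30 uncorrelated points (an assumption,
% not necessarily the data set used in the text)
rng(0)                      % fix the seed so the run is repeatable
x = randn(30,1);
y = randn(30,1);

outlier = [5 10 20];        % candidate positions (x,y) = (v,v) of the outlier
r = zeros(size(outlier));   % Pearson's r for each outlier position

for i = 1:numel(outlier)
    xo = [x; outlier(i)];   % append the outlier as point 31
    yo = [y; outlier(i)];
    R = corrcoef(xo,yo);    % 2-by-2 correlation matrix
    r(i) = R(1,2);          % the off-diagonal element is r
end

disp([outlier; r])          % first row: outlier value, second row: r
```

With data of this kind, one would expect r to increase toward 1 as the single outlier moves farther from the point cloud, because the outlier increasingly dominates both the covariance and the variances.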

We can compare the sensitivity of Pearson's correlation coefficient with that
of Spearman's correlation coefficient and Kendall's correlation coefficient
using the function corr. In contrast to corrcoef, this function does not
calculate correlation matrices that we can later use (e.g., in Chapter 9) for
calculating correlations within multivariate data sets. We type

r_pearson = corr(x,y,'Type','Pearson')

r_spearman = corr(x,y,'Type','Spearman')

r_kendall = corr(x,y,'Type','Kendall')

which yields

r_pearson =

0.9403

r_spearman =

0.1343

r_kendall =

0.0753

and observe that the alternative measures of correlation result in reasonable
values, in contrast to the absurd value for Pearson's correlation coefficient
that mistakenly suggests a strong interdependency between the variables.
Although outliers are easy to identify in a bivariate scatter, erroneous values
can easily be overlooked in large multivariate data sets (Chapter 9).
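
One way to see why a rank-based measure is less affected by the outlier is to note that Spearman's coefficient is Pearson's coefficient computed on the ranks of the data. The following sketch assumes synthetic data (not necessarily the data used above) and the Statistics and Machine Learning Toolbox functions tiedrank and corr. The variable names rx, ry, and r_ranks are hypothetical.

```matlab
% Hypothetical synthetic data with one outlier at (20,20)
rng(0)
x = [randn(30,1); 20];
y = [randn(30,1); 20];

% Replace each value by its rank; the outlier simply becomes rank 31,
% no matter how far it lies from the other points
rx = tiedrank(x);
ry = tiedrank(y);

% Pearson's r computed on the ranks ...
R = corrcoef(rx,ry);
r_ranks = R(1,2);

% ... should agree with the built-in Spearman coefficient
r_spearman = corr(x,y,'Type','Spearman');

fprintf('r on ranks = %.4f, Spearman = %.4f\n', r_ranks, r_spearman)
```

Because the outlier contributes only its rank rather than its magnitude, moving it from (20,20) to an even more extreme position would leave the ranks, and hence this coefficient, unchanged.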

Various methods exist to calculate the significance of Pearson's correlation
coefficient. The function corrcoef also includes the possibility of evaluating
the quality of the result. The p-value is the probability of obtaining a
correlation as large as the observed value by random chance, when the true
correlation is zero. If the p-value is small (say, less than 0.05), then the
correlation coefficient r is significant.
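
As a minimal sketch of how the p-value can be obtained, the example below requests the second output argument of corrcoef and of corr. It assumes synthetic data (not necessarily the data used above) and a significance level alpha of 0.05, chosen here only for illustration.

```matlab
% Hypothetical synthetic data with one outlier at (20,20)
rng(0)
x = [randn(30,1); 20];
y = [randn(30,1); 20];

% Second output of corrcoef: matrix of p-values for testing the
% hypothesis of no correlation
[R,P] = corrcoef(x,y);
r = R(1,2);
p = P(1,2);

% corr also returns p-values as a second output
[r_spearman,p_spearman] = corr(x,y,'Type','Spearman');

alpha = 0.05;   % assumed significance level, for illustration only
fprintf('Pearson:  r = %.4f, p = %.4g, significant: %d\n', ...
    r, p, p < alpha)
fprintf('Spearman: r = %.4f, p = %.4g, significant: %d\n', ...
    r_spearman, p_spearman, p_spearman < alpha)
```

Note that a small p-value does not by itself rule out a correlation driven by a single outlier, so the test complements rather than replaces an inspection of the scatter plot.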
