Bivariate Statistics - MATLAB Recipes for Earth Sciences


[r,p] = corrcoef(x,y)

r =

1.0000 0.9403

0.9403 1.0000

p =

1.0000 0.0000

0.0000 1.0000

In our example the p-value is close to zero, suggesting that the correlation coefficient is significant. We conclude from this experiment that this particular significance test fails to detect correlations attributed to an outlier. We therefore try an alternative t-test statistic to determine the significance of the correlation between x and y. According to this test, we can reject the null hypothesis that there is no correlation if the calculated t is larger than the critical t (n-2 degrees of freedom, α=0.05).
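For reference, the test described above can be written out explicitly. This is the usual t statistic for a correlation coefficient r computed from n data pairs. The subscripted symbols are notation chosen for this note rather than taken from the text, and the one-sided quantile is assumed because the critical value is obtained with tinv(0.95,...):

```latex
t_{\mathrm{calc}} = r \sqrt{\frac{n-2}{1-r^{2}}}, \qquad
\text{reject } H_0 \text{ (no correlation) if } \;
t_{\mathrm{calc}} > t_{\mathrm{crit}} = t_{1-\alpha,\, n-2}, \quad \alpha = 0.05
```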

tcalc = r(2,1) * ((length(x)-2)/(1-r(2,1)^2))^0.5

tcrit = tinv(0.95,length(x)-2)

tcalc =

14.8746

tcrit =

1.6991

This result indeed indicates that we can reject the null hypothesis of no correlation, since the calculated t is much larger than the critical t. As an alternative to detecting outliers, resampling schemes or surrogates such as the bootstrap or jackknife methods represent powerful tools for assessing the statistical significance of the results. These techniques are particularly useful when scanning large multivariate data sets for outliers (see Chapter 9). Resampling schemes repeatedly resample the original data set of n data points, either by drawing n subsamples of n-1 data points each, leaving out one data point at a time (the jackknife), or by picking an arbitrary set of subsamples with n data points with replacement (the bootstrap). The statistics of these subsamples provide better information on the characteristics of the population than the statistical parameters (mean, standard deviation, correlation coefficients) computed from the full data set. The function bootstrp allows resampling of our bivariate data set, including the outlier (x,y)=(20,20).
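Before the bootstrap is applied, the jackknife variant described above can be illustrated with a short loop. This is only a minimal sketch under stated assumptions: the data are hypothetical stand-ins (30 random points plus the outlier at (20,20) mentioned in the text, not the book's actual values), and the variable names rhosjack, rfull, rhomin and imin are invented for this illustration.

```matlab
% Hypothetical stand-in data (an assumption, not the book's values):
% 30 normally distributed points plus the outlier (20,20).
rng(0)
x = randn(30,1);
y = randn(30,1);
x(31,1) = 20;
y(31,1) = 20;

% Jackknife: n subsamples of n-1 points, leaving out one point at a time.
n = length(x);
rhosjack = zeros(n,1);            % one correlation coefficient per subsample
for i = 1:n
    keep = true(n,1);
    keep(i) = false;              % omit the i-th data point
    r = corrcoef(x(keep),y(keep));
    rhosjack(i) = r(2,1);         % off-diagonal element: correlation of x and y
end

% Compare with the coefficient from the full data set.
rfull = corrcoef(x,y);
[rhomin,imin] = min(rhosjack);    % subsample with the lowest correlation
```

In data of this kind the correlation coefficient drops sharply only in the subsample that omits the outlier, so imin points to the data point responsible for the high correlation of the full data set. The bootstrap counterpart, using bootstrp, follows.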

rng(0)

rhos1000 = bootstrp(1000,'corrcoef',x,y);

This command first resamples the data a thousand times; it then calculates the correlation coefficient for each new subsample and stores the result in the variable rhos1000. Since corrcoef delivers a 2-by-2 matrix (as mentioned
