three decimal places), r² = 0.856² = 0.732. In this case, the coefficient of determination
indicates that 73.2% of the variation in the data can be explained by the line of best fit.
In other words, the model represents the relationship quite well. So, an r² value close to
zero indicates that the line (model) is a poor fit, while an r² value close to one indicates
that the model is a good fit. Of course, where the relationship is non-linear (e.g. the scatter
plot shows, along the horizontal axis, large values followed by small values, followed
by large values, in a 'V' shape) then r (and r²) may be close to zero even though the
variables are clearly related, and the scatter plot plays a key role in interpretation
(Rogerson, 2006). The regression and correlation examples in this section are based on
two variables (the dependent and an independent variable). Regression can easily be
expanded to include more than one independent variable, thus allowing the assessment of
the interrelationships between several variables simultaneously. In the case of more than
one independent variable, upper case characters are used for the correlation coefficient
and the coefficient of determination, thus R and R².
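
The point about 'V'-shaped relationships can be illustrated with a short sketch. The data values and the helper name `pearson_r` below are made up for illustration and are not taken from the text; the function simply applies the standard formula for the correlation coefficient to a perfectly linear data set and to a 'V'-shaped one.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient for two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sums of squares about the means
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical data: one linear relationship, one 'V'-shaped relationship
x = [-3, -2, -1, 0, 1, 2, 3]
y_linear = [2 * v + 1 for v in x]
y_v_shape = [abs(v) for v in x]

r_linear = pearson_r(x, y_linear)
r_v_shape = pearson_r(x, y_v_shape)

print("linear:  r =", round(r_linear, 3), " r^2 =", round(r_linear ** 2, 3))
print("V shape: r =", round(r_v_shape, 3), " r^2 =", round(r_v_shape ** 2, 3))
```

In the 'V'-shaped case y is completely determined by x, yet r and r² are effectively zero, which is why the scatter plot should be inspected before a low r² is read as 'no relationship'.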
Some forms of data (e.g. nominal or categorical variables) should not be analysed
directly using the methods outlined above. Alternative approaches are available in the
case of values that are constrained to be whole numbers. Percentages and proportions
should first be transformed before their analysis using standard statistical methods;
Aitchison (1986) details some appropriate methods.
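
As a rough illustration of what transforming proportions can involve, the sketch below applies a centred log-ratio transform, one of the log-ratio approaches associated with Aitchison's work on compositional data. The text does not say which method should be used, so the choice of transform, the function name `centred_log_ratio`, and the example proportions are all assumptions made for illustration.

```python
import math


def centred_log_ratio(parts):
    """Centred log-ratio transform of a composition (all parts must be > 0).

    Each part is logged and the mean of the logs is subtracted, which is
    equivalent to taking the log of each part divided by the geometric mean.
    """
    if any(p <= 0 for p in parts):
        raise ValueError("all parts must be strictly positive")
    logs = [math.log(p) for p in parts]
    mean_log = sum(logs) / len(logs)
    return [value - mean_log for value in logs]


# Hypothetical proportions of three land-cover classes in one area (sum to 1)
composition = [0.6, 0.3, 0.1]
transformed = centred_log_ratio(composition)

print([round(v, 3) for v in transformed])
```

The transformed values are no longer constrained to lie between zero and one, and they sum to zero by construction. Proportions of exactly zero cannot be logged and would need separate treatment.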
The topic of the following section is inferential statistics (i.e. statistical methods for
making inferences about a population from a sample as opposed to descriptive statistics,
which have been the focus in this section) and significance testing (e.g. testing for
the significance of the differences between groups). As an example, it is standard
practice to ascertain the significance of regression coefficients or the correlation
coefficient, and the testing of the latter is outlined below.
3.4 Inferential statistics
In this section, the focus is on statistical methods for making inferences about a popu-
lation from a sample as opposed to statistics which simply summarize a sample. Two
common tasks in inferential statistics contexts are (1) to consider the likelihood that a
statement about a given parameter (e.g. the mean or standard deviation) is true given
the available data and (2) to estimate the parameters (Brunsdon, 2008). The first of
these relates to the concept of hypothesis testing while in the second the confidence
interval is central.
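
The two tasks can be sketched for the mean of a single sample. The sample values and the hypothesised mean `mu0` below are invented for illustration, and the sketch uses a normal (z) approximation for simplicity; with a sample this small, the t distribution would usually be preferred, so the numbers should be read as indicative only.

```python
import math
from statistics import NormalDist, mean, stdev

# Hypothetical sample of 12 measurements
sample = [10.2, 11.5, 9.8, 12.1, 10.9, 11.3, 10.4, 12.6, 11.8, 10.7, 11.1, 12.0]

n = len(sample)
sample_mean = mean(sample)
standard_error = stdev(sample) / math.sqrt(n)  # stdev() uses the n - 1 divisor

# Task (2), estimation: approximate 95% confidence interval for the mean
z_crit = NormalDist().inv_cdf(0.975)
ci_low = sample_mean - z_crit * standard_error
ci_high = sample_mean + z_crit * standard_error

# Task (1), hypothesis testing: how plausible is a population mean of mu0?
mu0 = 10.0  # assumed value under the null hypothesis
z = (sample_mean - mu0) / standard_error
p_value = 2 * (1 - NormalDist().cdf(abs(z)))  # two-sided p-value

print("mean =", round(sample_mean, 2))
print("95% CI =", (round(ci_low, 2), round(ci_high, 2)))
print("z =", round(z, 2), " p =", round(p_value, 4))
```

With this approximation the two views agree: the hypothesised mean is rejected at the 5% level exactly when it falls outside the 95% confidence interval.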
A common objective in statistical inference is to compare sets of samples and assess
the degree of difference between the samples. In other words, we may be interested in
assessing the probability that two samples come from different populations. Comparison
of samples is based on tests of significance. That is, we test the significance
of the difference between two (or more) samples to assess if the difference between
them is likely to be 'real' in some sense. Questions of differences between samples are
usually phrased in terms of a null hypothesis and the alternative hypothesis, indicated