to large values of the other. Put another way, in general an increase in one corresponds
to an increase in the other—it can be said that the variables are positively related to
one another. If, as the values in one variable increase, the values of the other variable
decrease (or vice versa) then the relationship is said to be negative. Analysis of
relationships often proceeds using correlation and regression, which enable exploration
of the nature of the relationship between two or more variables and the strength
of the relationship between them. Correlation and regression are explored in this
section.
In the case of two variables, regression is used to fit a line through the points on
a scatter plot, this line being as close as possible to all the points according to some
criterion. This is called the line of best fit. The line represents the trend in the data.
If the variables are positively related, the line will be low with respect to the z-axis
(representing small values) on the left of the graph and will increase diagonally
from left to right. This would be the case for a line fitted to the plot in Figure 3.3.
The correlation coefficient, r, provides a measure of the nature and strength of the
relationship between variables. More specifically, it can be interpreted as indicating
the degree to which points scatter around the regression line. Before detailing the
measurement of correlation, the procedure for fitting a line to the scatter plot is
detailed.
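As a rough illustration of the two ideas above, the sketch below fits a straight line to a small set of paired values and computes the correlation coefficient r. The data are invented for illustration and are not taken from Figure 3.3. The fitting criterion is assumed to be ordinary least squares (as implemented by NumPy's polyfit), which is one common choice rather than necessarily the one intended here.

```python
import numpy as np

# Hypothetical paired observations (not from the source): elevation in metres (y)
# and snowfall in centimetres (z) at six locations.
elevation = np.array([120.0, 340.0, 560.0, 780.0, 1010.0, 1250.0])
snowfall = np.array([14.0, 22.0, 27.0, 41.0, 48.0, 63.0])

# Fit a straight line. With deg=1, np.polyfit minimises the sum of squared
# vertical distances between the points and the line (ordinary least squares),
# which is an assumed choice of "best fit" criterion.
slope, intercept = np.polyfit(elevation, snowfall, deg=1)

# Pearson correlation coefficient r, read from the 2 x 2 correlation matrix.
r = np.corrcoef(elevation, snowfall)[0, 1]

print(f"intercept = {intercept:.3f}, slope = {slope:.4f}, r = {r:.3f}")
```

With these invented values the slope is positive and r is close to 1, which is what a tight, upward-trending scatter of points would be expected to give.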
In this example, the variable y (elevation) is the independent variable while the
variable z (snowfall) is the dependent variable. As well as allowing for exploration of the
nature of the relationship between two variables, regression enables prediction of the
values of dependent variables given values of independent variables. For example, if
we have a raster map of elevation values across a region (i.e. we have values at all
locations of interest) but only a few snowfall measurements, we could conduct
regression by taking elevation values at the snowfall measurement locations and
plotting these against the snowfall values. Once a line is fitted, the regression equation
(indicating the form of the fitted line) can be used to predict snowfall values at
locations where there are no snowfall measurements because the regression equation tells
us what snowfall amounts to expect for any given value of elevation. This process is
described below. The regression equation can be given by:
$$\hat{z}_i = \beta_0 + \beta_1 y_i \qquad (3.5)$$
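As a minimal sketch of how equation 3.5 turns elevation values into predicted snowfall, the example below applies it to every cell of a small elevation raster. The coefficient values (b0, b1) and the raster itself are hypothetical and chosen only for illustration. They are not fitted values from the data discussed here.

```python
import numpy as np

# Hypothetical coefficients, chosen only for illustration (not fitted values
# from the source): b0 in centimetres, b1 in centimetres per metre of elevation.
b0 = 5.0
b1 = 0.04

# Hypothetical 2 x 3 elevation raster in metres.
elevation_raster = np.array([[200.0, 450.0, 700.0],
                             [950.0, 1200.0, 1450.0]])

# Equation 3.5 applied cell by cell: predicted z = b0 + b1 * y.
predicted_snowfall = b0 + b1 * elevation_raster

print(predicted_snowfall)
```

Because b1 is positive in this sketch, cells with higher elevation receive larger predicted snowfall values.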
Equation 3.5 indicates that the predicted value of z_i (the hat on top of the letter
denotes a prediction), which is the value given by the line of best fit, is obtained by
adding β_0 to β_1 multiplied by y_i (in this example, the elevation value at location i).
β is the lower-case Greek letter beta, and these components are referred to as the beta
coefficients. β_0 is called the intercept and is the point where the line crosses the
vertical axis (representing the z variable in Figure 3.3), that is, the predicted value
of z when y is zero. β_1 is called the slope coefficient. A negative value
for the slope indicates a negative relationship and a positive value indicates a positive
relationship. In this case, we know β_1 will be positive as the scatter plot shows that
a higher elevation corresponds to a greater amount of snow. What is needed is
a method to identify appropriate values of β_0 and β_1. Once we have these, we have