Geoscience Reference
In-Depth Information
Fig. 3.6 Inspection of the variable ProtectedAreas using QQ-plot and PDE
(axis labels: Normal N(0,1), sqrt(ProtectedAreas))
(Kolmogorov-Smirnov, chi-squared, Jarque-Bera, etc.) and/or visual checks such as
QQ-plots (cf. Fig. 3.1, where the validation step is highlighted in red).
The description step sometimes also reveals initial classifications (cf.
Fig. 3.1). One example is the detection of possible subclasses of SealedSurface as
described above. Another important aim is to identify useful nonlinear
transformations, such as log or sqrt, that make the distributions of the
variables comparable (cf. Fig. 3.9 below). The nonlinear transformations applied
to the UD data are given in Appendix 1.
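The effect of a normalizing transformation can be checked numerically as well as visually. The following is a minimal sketch in Python with NumPy and SciPy, not the procedure used for the UD data: the sample is synthetic (constructed so that sqrt is normalizing by design), and the helper name normality_report is hypothetical.

```python
import numpy as np
from scipy import stats

def normality_report(x):
    """Return skewness and p-values of two normality tests for a 1-D sample."""
    x = np.asarray(x, dtype=float)
    # Standardize so that N(0,1) is the reference distribution.
    # Caveat: estimating mean and std from the same sample makes the
    # Kolmogorov-Smirnov p-value only approximate.
    z = (x - x.mean()) / x.std(ddof=1)
    _, ks_p = stats.kstest(z, "norm")
    _, jb_p = stats.jarque_bera(x)   # based on skewness and kurtosis
    return {"skewness": float(stats.skew(x)),
            "ks_p": float(ks_p),
            "jb_p": float(jb_p)}

rng = np.random.default_rng(0)
# Synthetic stand-in for ProtectedAreas (assumption, not the real data):
# the square of a normal variable, so sqrt() restores normality.
protected_areas = rng.normal(loc=4.0, scale=1.0, size=1000) ** 2

raw = normality_report(protected_areas)
transformed = normality_report(np.sqrt(protected_areas))
print("raw:        ", raw)
print("transformed:", transformed)

# QQ-plot data: theoretical N(0,1) quantiles against the ordered sample.
# r is the correlation of the points with the fitted straight line.
(theoretical_q, ordered_vals), (slope, intercept, r) = stats.probplot(
    np.sqrt(protected_areas), dist="norm"
)
print("QQ-plot straightness r =", r)
```

Plotting ordered_vals against theoretical_q gives the QQ-plot itself; points close to a straight line indicate that the transformed variable is approximately normal.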
3.3.3 Looking for Correlation Structures
After selecting a normalizing nonlinear transformation, it is useful to identify
correlations among the variables. Two typical methods for this are scatterplots and
the calculation of correlation measures.
If normalizing nonlinear transformations have first been applied to the data,
linear correlation measures such as the Pearson correlation coefficient can be
used. Otherwise, rank-based correlation measures such as the Spearman correlation
coefficient or Kendall's Tau must be used, since they depend only on the ranks of
the values. Figure 3.7 shows a matrix of all pairwise scatterplots. It can
be seen that some variables are highly correlated. For example, SealedSurface
and LandConsumption are strongly positively correlated, while BuildingArea and
SettlementDensity show strong negative correlation. Figure 3.8 visualizes the
Pearson correlation coefficients of the transformed data.
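The difference between linear and rank-based correlation measures can be illustrated with a short Python sketch using SciPy. The data below are synthetic log-normal samples, and the variable names merely echo those in the text; they are assumptions for illustration, not the UD data.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n = 400
# Hypothetical latent normal variables with correlation 0.8.
latent = rng.multivariate_normal([0.0, 0.0], [[1.0, 0.8], [0.8, 1.0]], size=n)
# Exponentiating yields skewed (log-normal) variables, so log() is the
# normalizing transformation by construction.
sealed_surface = np.exp(latent[:, 0])
land_consumption = np.exp(latent[:, 1])

log_a, log_b = np.log(sealed_surface), np.log(land_consumption)

# Pearson measures linear association, so it changes under the transformation.
pearson_raw, _ = stats.pearsonr(sealed_surface, land_consumption)
pearson_log, _ = stats.pearsonr(log_a, log_b)

# Spearman and Kendall use ranks only, so a strictly monotone
# transformation such as log leaves them unchanged.
spearman_raw, _ = stats.spearmanr(sealed_surface, land_consumption)
spearman_log, _ = stats.spearmanr(log_a, log_b)
kendall_raw, _ = stats.kendalltau(sealed_surface, land_consumption)
kendall_log, _ = stats.kendalltau(log_a, log_b)

print(f"Pearson  raw={pearson_raw:.3f}  log={pearson_log:.3f}")
print(f"Spearman raw={spearman_raw:.3f}  log={spearman_log:.3f}")
print(f"Kendall  raw={kendall_raw:.3f}  log={kendall_log:.3f}")
```

Because the rank-based coefficients are identical before and after the transformation, they can be applied to the untransformed data, whereas the Pearson coefficient is best interpreted on the transformed, approximately normal variables.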
 