Geology Reference
In-Depth Information
Each principal component is a linear combination of the original variables and is
calculated in such a way that each successive component will account for a definite
amount of variation in X. The derivation of PCs is based on the covariance matrix
or correlation matrix of X. The correlation matrix-based approach has the upper
hand in dealing with real-world problems such as those in hydrology with variables
measured in different units. A good comparison between these two matrix-based
approaches can be found in Jolliffe [41]. The results from the above application of
PCA are normally presented in matrix form. The component weights, a_ij, define the
component's position in the space, a matrix of size p × p. The item Y(j) can be used
to compute component scores (the representation of X in the principal component
space). Y is a matrix of size n × p, which is comprised of these scores. In addition to
these matrices, variances of the PCs in decreasing order form a matrix S of size
p × 1. Thus PCA transforms a data set X by rotating the original axes of a
p-dimensional space and deriving a new set of axes (components) in such a manner
that the first axis accounts for the maximum variance. In most cases, the first few
(q) components will retain most of the variation present in all of the original
variables (p) and thus an essential dimensionality reduction may be achieved by
projecting the original data on this new q-dimensional space, as long as q < p.
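The matrices described above can be illustrated with a minimal sketch in Python using NumPy. This is only one common way of computing PCA (eigen-decomposition of the covariance or correlation matrix), not necessarily the procedure used in the studies cited here. The function names pca and n_components_for, the 90% variance threshold, and the synthetic rainfall, runoff and temperature series are all assumptions made for illustration.

```python
import numpy as np

def pca(X, use_correlation=True):
    """Eigen-decomposition PCA of an n x p data matrix X.

    Returns A (p x p weights), Y (n x p scores) and S (p variances,
    sorted so that the first component has the largest variance).
    """
    X = np.asarray(X, dtype=float)
    Z = X - X.mean(axis=0)                      # centre each variable
    if use_correlation:
        Z = Z / X.std(axis=0, ddof=1)           # standardise: correlation-matrix PCA
    C = np.atleast_2d(np.cov(Z, rowvar=False))  # p x p covariance (or correlation) matrix
    eigvals, eigvecs = np.linalg.eigh(C)        # eigh returns eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1]           # reorder so PC1 comes first
    S = eigvals[order]                          # variances of the PCs
    A = eigvecs[:, order]                       # column j holds the weights a_ij of PC j
    Y = Z @ A                                   # component scores
    return A, Y, S

def n_components_for(S, threshold=0.9):
    """Smallest q whose leading components explain at least `threshold` of the variance."""
    cumulative = np.cumsum(S) / S.sum()
    q = int(np.searchsorted(cumulative, threshold)) + 1
    return min(q, len(S))

# Synthetic, purely illustrative data: runoff is built to depend on rainfall.
rng = np.random.default_rng(0)
n = 200
rain = rng.gamma(2.0, 5.0, size=n)
flow = 0.8 * rain + rng.normal(0.0, 1.0, size=n)
temp = rng.normal(15.0, 4.0, size=n)
X = np.column_stack([rain, flow, temp])

A, Y, S = pca(X)
q = n_components_for(S, threshold=0.9)
print("variance share per PC:", S / S.sum())
print("components retained (q):", q)
```

Keeping only the first q columns of Y gives the reduced q-dimensional representation of the data. The threshold is a modelling choice, not something fixed by the method.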
There is some debate amongst climatic modelers on deciding whether the
covariance or correlation matrix should be used. Overland and Preisendorfer [56]
used both the covariance and correlation matrix for PCA applied to cyclone
frequency data. The study found that the covariance matrix was best used for fitting
data and locating individual variables which represented a large variance in the data
set, whereas the correlation matrix was best used to examine the spatial features.
Some researchers used the covariance matrix, since all variables were measured in
the same units [45, 52]. There are many PCA-based hydrological studies in
the literature. Wigley et al. [80] used PCA to assess the temporal and spatial
patterns of precipitation data. In this study, precipitation data from
the period 1861–1970 collected from 55 stations spread throughout England and
Wales were subjected to PCA to define areas of differential variability. This study
suggested that England and Wales could be divided into five main regions of
precipitation variability: south-east, south-west, central, north-east, and
north-west. In Tabony [71], PCA was applied to precipitation data collected over a
larger area of 182 stations covering the whole of Europe for the period 1861–1970. In
both of these studies, similarities were found for the first two PCs but higher order
components differed to a larger degree in the second study.
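The practical difference between the covariance and correlation approaches discussed in this paragraph can be seen in a small sketch. It is an illustration under stated assumptions, not a reproduction of any cited study: the helper first_pc and the synthetic rainfall (mm) and water level (m) series are hypothetical. The same data are analysed twice, once with rainfall in millimetres and once in metres.

```python
import numpy as np

def first_pc(X, use_correlation):
    """Weights and explained-variance share of the first PC of an n x p matrix X."""
    X = np.asarray(X, dtype=float)
    # np.corrcoef rescales every variable to unit variance; np.cov keeps the raw units.
    C = np.corrcoef(X, rowvar=False) if use_correlation else np.cov(X, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(C)   # ascending order, so the last column is PC1
    return eigvecs[:, -1], eigvals[-1] / eigvals.sum()

# Synthetic, purely illustrative series: water level depends partly on rainfall.
rng = np.random.default_rng(1)
rain_mm = rng.gamma(2.0, 5.0, size=300)
level_m = 0.05 * rain_mm + rng.normal(0.0, 0.3, size=300)

X_mm = np.column_stack([rain_mm, level_m])           # rainfall in millimetres
X_m = np.column_stack([rain_mm / 1000.0, level_m])   # same rainfall in metres

w_cov_mm, share_cov_mm = first_pc(X_mm, use_correlation=False)
w_cov_m, share_cov_m = first_pc(X_m, use_correlation=False)
w_cor_mm, share_cor_mm = first_pc(X_mm, use_correlation=True)
w_cor_m, share_cor_m = first_pc(X_m, use_correlation=True)

print("covariance PC1 weights (mm):", np.abs(w_cov_mm))
print("covariance PC1 weights (m): ", np.abs(w_cov_m))
print("correlation PC1 weights (mm):", np.abs(w_cor_mm))
print("correlation PC1 weights (m): ", np.abs(w_cor_m))
```

With the covariance matrix, the first component is dominated by whichever variable has the larger numerical variance, so a change of units changes the result. The correlation-based weights are unaffected by the rescaling (up to sign), which is why that approach suits variables measured in different units.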
PCA is useful in assessing redundancy due to the possible correlation of one
input to another in a modeling data set with a large number of input data series. PCA
can be used to reduce the data input series into a smaller number of principal
components (artificial variables) that will account for most of the variance in the
actual data. The first principal component is the combination of the original data
series used in the study that explains the greatest amount of variation. The second
principal component defines the next largest amount of variation and is independent
 