T and P consist of orthogonal and orthonormal vectors, respectively (Rajalahti and Kvalheim, 2011):
$X = TP^{T} + E = t_1 p_1^{T} + t_2 p_2^{T} + \dots + t_A p_A^{T} + E$   [4.4]
If X is an M × N matrix consisting of M samples (rows) and N variables
(columns), then T is an M × A matrix and $P^{T}$ is an A × N matrix, where A
is the number of PCs. E is an M × N matrix containing the residuals, that is,
the variance not explained by the PCs (Rajalahti and Kvalheim, 2011). Matrix X
is thus decomposed into a sum of products of score vectors $t_a$ and loading
vectors $p_a$, where a = 1, 2, . . ., A. The constraint is that the weight vector $w_a$
is equal to the loading vector $p_a$. Once the first PC (latent variable) is calculated, it is
subtracted from the data matrix, $X_a - t_a p_a^{T}$, and the next PC is calculated from the deflated matrix.
Usually only the first few PCs, which explain most of the variance in the data, are
calculated, and the remaining noise is left in the residuals. Therefore, the
information contained in the first PC is more significant than that in the second,
the second component is more significant than the third, and so on
(Massart and Buydens, 1988). PCA is especially useful for data presentation
(visualization), since the score plots reveal patterns, such as clusters,
trends, and outliers, in the data. Loading plots reveal covariances among
variables and can be used to interpret patterns observed in the score plot.
Therefore, score and loading plots should be interpreted simultaneously.
For graphical purposes, two PCs are optimal, since the scores can then be displayed in a single two-dimensional plot.
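As a concrete illustration, the following is a minimal NIPALS-style PCA sketch in Python/NumPy, matching the score/loading deflation described above. The function name nipals_pca, the starting-vector heuristic, and the convergence settings are illustrative assumptions, not taken from the source:

```python
import numpy as np

def nipals_pca(X, n_components, tol=1e-10, max_iter=500):
    """Minimal NIPALS PCA: decompose X (M x N) into scores T (M x A),
    orthonormal loadings P (N x A), and residuals E, so X ~= T @ P.T + E."""
    X = np.asarray(X, dtype=float)
    X = X - X.mean(axis=0)                  # mean-center each variable
    M, N = X.shape
    T = np.zeros((M, n_components))
    P = np.zeros((N, n_components))
    for a in range(n_components):
        # start the score vector from the highest-variance column (a common heuristic)
        t = X[:, [int(np.argmax(X.var(axis=0)))]]
        for _ in range(max_iter):
            p = X.T @ t / (t.T @ t)         # loading estimate
            p = p / np.linalg.norm(p)       # keep loading vectors orthonormal
            t_new = X @ p                   # score estimate
            if np.linalg.norm(t_new - t) < tol:
                t = t_new
                break
            t = t_new
        T[:, [a]] = t
        P[:, [a]] = p
        X = X - t @ p.T                     # deflation: subtract the a-th PC, X_a - t_a p_a^T
    return T, P, X                          # X now holds the residual matrix E

# toy usage: first two PCs of a random 10 x 4 matrix
rng = np.random.default_rng(0)
T, P, E = nipals_pca(rng.normal(size=(10, 4)), n_components=2)
```

A score plot of the first two columns of T would then be inspected together with the corresponding loading plot of P, as discussed above.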
Unsupervised clustering methods can be hierarchical, where successive
partitioning of the data set results in a sequence of clusters represented as a tree,
or dendrogram (Roggo et al., 2007). Non-hierarchical methods include
Gaussian mixture models, K-means, density-based spatial clustering of
applications with noise (DBSCAN), Kohonen neural networks, etc. (Lopes et al.,
2004; Roggo et al., 2007).
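For orientation, a brief sketch of both families in Python, using SciPy for hierarchical clustering and scikit-learn for K-means. The synthetic two-cluster data set is an illustrative assumption:

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from sklearn.cluster import KMeans

# synthetic data: two well-separated groups of 20 samples x 5 variables
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 1.0, (20, 5)),
               rng.normal(5.0, 1.0, (20, 5))])

# hierarchical: successive merges build a tree (dendrogram), encoded in Z;
# cutting the tree at a chosen level yields a partition into clusters
Z = linkage(X, method="ward")
hier_labels = fcluster(Z, t=2, criterion="maxclust")

# non-hierarchical: K-means partitions the data directly into k clusters
km_labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)

print(hier_labels)
print(km_labels)
```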
Supervised classification methods
The supervised classification methods most often used are correlation-based
methods, distance-based methods, linear discriminant analysis
(LDA), soft independent modeling of class analogy (SIMCA), and
PLS discriminant analysis (PLS-DA) (Roggo et al., 2007). Some of
these methods are more focused on discrimination between samples (LDA),
whereas others are concerned with their similarity (SIMCA). In addition to
linear methods, non-linear classification methods such as neural
networks can be used.
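To make the distinction concrete, here is a minimal sketch of LDA and PLS-DA with scikit-learn. The synthetic two-class data and the 0.5 decision threshold for PLS-DA are illustrative assumptions, not taken from the source:

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.cross_decomposition import PLSRegression

# synthetic two-class data: 15 samples per class, 8 variables
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0.0, 1.0, (15, 8)),
               rng.normal(2.0, 1.0, (15, 8))])
y = np.array([0] * 15 + [1] * 15)

# LDA: finds directions that best discriminate between the classes
lda = LinearDiscriminantAnalysis().fit(X, y)
print(lda.predict(X[:3]))                     # predicted class labels

# PLS-DA: PLS regression against a dummy-coded class variable;
# here, predictions above 0.5 are assigned to class 1
pls = PLSRegression(n_components=2).fit(X, y.astype(float))
print((pls.predict(X[:3]).ravel() > 0.5).astype(int))
```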
Correlation- and distance-based methods cluster data by measuring
their (dis)similarity. The similarity of samples can be expressed by the