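[Figure 3.22: axis predictivity plotted against dimension of subspace (1 to 4), with one curve each for SPR, RGF, PLF and SLF and one for the weighted mean (= quality).]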
Figure 3.22 Plot of axis predictivity against dimensionality. Note that the dimensions
are cumulative; for example, the three-dimensional fits include the first and second
dimensions.
three dimensions. Of course, the incommensurability of the variables has a major
effect. To obtain the overall quality, the predictivities have to be weighted in proportion
to the variances of the variables. The variances are (5.5881, 0.4116, 0.0079, 1.0533),
giving the weights (0.7914, 0.0583, 0.0011, 0.1492) required to reproduce the overall
cumulative qualities (0.7915, 0.9677, 0.9992, 1.0000).
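The weighting just described is simple arithmetic, and a short sketch may make it concrete. The Python below is illustrative only: the function names variance_weights and overall_quality are hypothetical and not part of the book's software. It divides each quoted variance by the total variance and then forms the weighted mean of a set of axis predictivities. The only predictivities used are those of the full four-dimensional fit, where every axis is reproduced exactly and each predictivity is therefore 1.

```python
# Variances of the four variables, as quoted in the text.
variances = [5.5881, 0.4116, 0.0079, 1.0533]

def variance_weights(variances):
    """Weight each variable by its share of the total variance."""
    total = sum(variances)
    return [v / total for v in variances]

def overall_quality(axis_predictivities, weights):
    """Overall quality as the weighted mean of the axis predictivities."""
    return sum(w * p for w, p in zip(weights, axis_predictivities))

weights = variance_weights(variances)

# Compare with the weights quoted in the text, to four decimal places.
quoted_weights = [0.7914, 0.0583, 0.0011, 0.1492]
agrees = all(abs(w - q) < 5e-5 for w, q in zip(weights, quoted_weights))
print([round(w, 4) for w in weights], agrees)

# In the full-dimensional fit every axis predictivity is 1, so the
# weighted mean is 1 as well, matching the last cumulative quality.
full_fit_quality = overall_quality([1.0] * len(weights), weights)
print(round(full_fit_quality, 4))
```

The same overall_quality call, applied to the axis predictivities of a lower-dimensional fit, would give the corresponding cumulative quality; those predictivities are tabulated elsewhere in the chapter and are not repeated here.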
Table 3.6 shows axis predictivity to be larger than axis adequacy. This
can be proved to be true in general (see Gardner-Lubbe et al., 2008), but the result is of
academic interest only, since adequacy and predictivity measure different things and are
not comparable. Adequacy may be of interest for comparing the approximation to V, as in
factor analysis, but not when concerned with approximating X.
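The inequality can be illustrated in the smallest possible case, a one-dimensional fit to two centred variables. The sketch below rests on assumptions that are not restated in this excerpt: axis adequacy is taken to be the squared loading of the variable on the first principal axis, and axis predictivity to be the ratio of the fitted to the total sum of squares for that variable. The data values and the function name first_axis_measures are hypothetical, chosen only for illustration. Under these assumptions predictivity equals adequacy multiplied by the ratio of the largest eigenvalue to the variable's own sum of squares, and that ratio is at least 1.

```python
import math

def first_axis_measures(col_a, col_b):
    """Adequacy, predictivity, weights and quality for a
    one-dimensional PCA fit of two variables (assumed definitions)."""
    n = len(col_a)
    mean_a, mean_b = sum(col_a) / n, sum(col_b) / n
    a = [x - mean_a for x in col_a]
    b = [y - mean_b for y in col_b]

    # Entries of the 2 x 2 matrix of centred sums of squares and products.
    s_aa = sum(x * x for x in a)
    s_bb = sum(y * y for y in b)
    s_ab = sum(x * y for x, y in zip(a, b))

    # Largest eigenvalue of that symmetric matrix, in closed form.
    lam = (s_aa + s_bb) / 2 + math.hypot((s_aa - s_bb) / 2, s_ab)

    # Corresponding unit eigenvector (first principal axis).
    if s_ab == 0:
        v = (1.0, 0.0) if s_aa >= s_bb else (0.0, 1.0)
    else:
        raw = (s_ab, lam - s_aa)
        norm = math.hypot(raw[0], raw[1])
        v = (raw[0] / norm, raw[1] / norm)

    adequacy = [v[0] ** 2, v[1] ** 2]
    # Fitted sum of squares for variable j is lam * v_j^2.
    predictivity = [lam * v[0] ** 2 / s_aa, lam * v[1] ** 2 / s_bb]

    total = s_aa + s_bb
    # Shares of total sum of squares; identical to shares of total variance.
    weights = [s_aa / total, s_bb / total]
    quality = lam / total
    return adequacy, predictivity, weights, quality

# Hypothetical data, for illustration only.
x = [2.0, 4.0, 6.0, 8.0, 10.0]
y = [1.0, 1.5, 1.2, 2.1, 1.9]
adequacy, predictivity, weights, quality = first_axis_measures(x, y)
for name, a_j, p_j in zip(("x", "y"), adequacy, predictivity):
    print(name, "adequacy", round(a_j, 4), "predictivity", round(p_j, 4))
print("quality", round(quality, 4))
```

In this sketch the variance-weighted mean of the two predictivities reproduces the overall quality, which is the same relationship used for the four-variable example above.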
We provide a function PCA.predictivities for calculating the above measures
of fit. PCA.predictivities takes a data matrix as first argument, centres the data and
optionally scales it to unit variances. It returns a list with components Quality, Weights,
Adequacies, Sample.predictivities.original, Axis.predictivities.original,
Sample.predictivities.new, Axis.predictivities.new.1 and Axis.predictivities.new.2.
Except for Weights, all other components are given for dimensions 1, 2, ..., p.
Tables 3.7 and 3.8 are examples of the output of
PCA.predictivities. Because of the very high predictivity of SPR in the first
dimension, together with relatively small predictivities of the other variables in that
dimension, the warning encountered in Section 2.5 regarding pre-scaling of the data
should be seriously considered here. We note from Table 3.8 that all samples except