Fig. 5.5 Reconstruction of a spectrum from a signal acquired in channel 4 of a piece with an XZ plane crack defect. The plot shows the original, reconstructed, and residual spectra against frequency (horizontal axis scaled by 10^4).
reconstructed data). The reconstructed spectrum was obtained using the first 20
principal components (95 % explained variance). The reconstructed data are
smoother than the original data, and the residual data do not contain
significant features.
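The reconstruction step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the data are synthetic stand-ins for impact-echo spectra, the function name `pca_reconstruct` is hypothetical, and PCA is computed through a singular value decomposition of the mean-centered data matrix.

```python
import numpy as np

def pca_reconstruct(X, n_components):
    """Reconstruct the rows of X from their leading principal components.

    Hypothetical helper for illustration; returns the reconstruction,
    the residual, and the fraction of variance explained.
    """
    mean = X.mean(axis=0)
    Xc = X - mean
    # Rows of Vt are principal directions, ordered by decreasing singular value.
    _, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    W = Vt[:n_components]
    scores = Xc @ W.T              # reduced-dimensionality vectors
    X_hat = scores @ W + mean      # back-projection into the original space
    explained = (s[:n_components] ** 2).sum() / (s ** 2).sum()
    return X_hat, X - X_hat, explained

# Synthetic stand-in (assumption): 200 spectra with 256 frequency bins,
# built from a few shared patterns plus small noise.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5)) @ rng.normal(size=(5, 256))
X += 0.1 * rng.normal(size=X.shape)

X_hat, residual, explained = pca_reconstruct(X, n_components=20)
```

By construction, the original spectrum equals the reconstruction plus the residual, so inspecting `residual` is a direct way to check whether the discarded components carried any significant feature.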
One problem that arises from dimensionality reduction is that weak
discriminative features present in only a small portion of the data may be lost
in the reconstruction. Such features exhibit low correlation with the rest of the
data, so PCA allocates them to the least significant components. However,
rare features of this kind may also be spurious, caused for example by
instrumental issues, in which case they have to be filtered out before
classification. The impact-echo data of this application did not show such
features, as Fig. 5.5 shows.
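The effect on rare features can be illustrated with a small synthetic experiment. The sketch below is an assumption-laden toy example (the data, the peak height, and the number of retained components are all invented for illustration): a peak present in only three spectra is added to data dominated by five shared patterns, and the share of that peak ending up in the residual is measured.

```python
import numpy as np

rng = np.random.default_rng(1)
n_samples, n_bins, k = 500, 256, 5

# Dominant structure shared by all spectra (rank 5) plus small noise.
X = rng.normal(size=(n_samples, 5)) @ rng.normal(size=(5, n_bins))
X += 0.1 * rng.normal(size=X.shape)

# Rare feature (assumption): a peak of height 2 at one bin, in only 3 spectra.
rare_rows, rare_bin, height = [10, 200, 350], 100, 2.0
X[rare_rows, rare_bin] += height

# PCA via SVD, keeping only the k most significant components.
Xc = X - X.mean(axis=0)
_, _, Vt = np.linalg.svd(Xc, full_matrices=False)
W = Vt[:k]
residual = Xc - (Xc @ W.T) @ W

# Fraction of the rare peak that is missing from the reconstruction.
lost = residual[rare_rows, rare_bin].mean() / height
```

In this toy setting most of the peak ends up in `residual` rather than in the reconstruction, which is the behavior described above: a feature that is uncorrelated with the bulk of the data is not captured by the leading components.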
In order to obtain a mass spectra version for the multichannel setup of the
impact-echo experiments, the vectors of reduced dimensionality
$V_i^{(\mathrm{PCA})}$, $i = 1 \ldots N$, of each experiment were arranged into
a single vector $V^{(\mathrm{PCA})}$ whose dimensionality increases by a factor
of $N$, i.e., dimensionality = 7 × 20 = 140 (see Eq. (5.13)). PCA was applied to
$V^{(\mathrm{PCA})}$ to obtain a version of reduced dimensionality
$V^{((\mathrm{PCA}))}$ of the data vectors in Eq. (5.14). The number of
components retained was estimated at 50, reducing the dimensionality from 140
to 50 while retaining 92 % of the variance of the $V^{(\mathrm{PCA})}$ data
vectors. Thus, the final dimensions of the input data matrices for the
classification stage were (2,100 × 50) and (2,030 × 50) for simulations and
experiments, respectively. A rationale for the PCA application and the
selection of a certain number of components is included in [ 26 ].
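The two-stage reduction can be sketched as below. This is only an illustration under assumptions: the data are random stand-ins, the number of spectra and frequency bins are invented, the helper `pca_scores` is hypothetical, and fitting a separate PCA per channel is one plausible reading of the text rather than a confirmed detail. Only the channel count (7), the per-channel dimensionality (20), and the final dimensionality (50) come from the text.

```python
import numpy as np

def pca_scores(X, n_components):
    """Project mean-centered rows of X onto the leading principal directions."""
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T

rng = np.random.default_rng(2)
n_channels, n_spectra, n_bins = 7, 300, 128   # n_spectra, n_bins are assumptions

# One spectra matrix per channel (synthetic stand-in data).
channels = [rng.normal(size=(n_spectra, n_bins)) for _ in range(n_channels)]

# Stage 1: reduce each channel to 20 components.
reduced = [pca_scores(X, 20) for X in channels]

# Arrange the per-channel vectors into a single vector per experiment:
# dimensionality 7 x 20 = 140.
stacked = np.hstack(reduced)

# Stage 2: apply PCA again to go from 140 to 50 dimensions.
final = pca_scores(stacked, 50)
```

Each row of `final` plays the role of one input vector for the classification stage; with 2,100 or 2,030 rows instead of 300, the matrix shapes match those quoted above.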
Figure 5.6 shows an outline of the steps followed above in order to obtain the
data vectors for classification and the steps of the classification stage to obtain the
results. The dimensionality of the data vectors was 50, corresponding to a
compressed mass spectra data set from the multichannel impact-echo simulations and
experiments. The classifiers employed in the classification stage were: LDA, MLP,
 