Conventional spectrum analysis methods are based on physical models,
which relate the proportion of an element to the integral around the
peaks that correspond to certain lines of that element. The physics
involved is relatively complex: peaks overlap, and spurious effects and
measurement noise are present. The method relies on a local analysis of
the phenomena: concentrations are estimated by computations on spectrum
data in the vicinity of the lines.
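To make the "local" nature of the conventional method concrete, here is a minimal sketch of a peak-window integral. The function name `peak_area`, the straight-line baseline, and the synthetic spectrum are assumptions made for illustration; the text does not specify how the integral around a line is actually computed.

```python
import numpy as np


def peak_area(spectrum, center, half_width):
    """Net area in a window around `center`, after removing a linear baseline.

    Hypothetical helper: the baseline is assumed to be the straight line
    joining the two edges of the window.
    """
    lo, hi = center - half_width, center + half_width
    window = np.asarray(spectrum[lo:hi + 1], dtype=float)
    baseline = np.linspace(window[0], window[-1], window.size)
    return float(np.sum(window - baseline))


# Synthetic 4096-channel spectrum: flat background plus one Gaussian peak.
channels = np.arange(4096)
spectrum = 10.0 + 500.0 * np.exp(-0.5 * ((channels - 1200) / 5.0) ** 2)

area = peak_area(spectrum, center=1200, half_width=30)
```

Only the channels near the line enter the estimate, which is why overlapping peaks or local noise affect it directly.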
The CCA approach is different. It is based on a global analysis of the curve.
The spectrum is viewed as a point in a space with 4096 components. In that
R^4096 space, the actual dimension of the surface on which the spectrum
points are distributed is equal to two: the spectra depend on 2 parameters
only, namely, the uranium and thorium concentrations. A reduction from
R^4096 to R^2 was found to be suitable for the problem: the information
“lost” by projection is not a discriminating factor for the measurement of
concentrations.
The database contains 60 spectra. Each spectrum has 4096 components.
The dimension of the matrix of the data sample is thus 60 × 4096. Reduction
by CCA therefore consists in transforming that sample into a matrix of
60 × 2.
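The reduction from a 60 × 4096 matrix to a 60 × 2 matrix can be sketched as follows. This is a minimal CCA-style projection, assuming the commonly described stochastic update with a step-function neighbourhood weighting. The learning-rate and radius schedules, the PCA initialisation, the function names, and the synthetic two-parameter "spectra" are all assumptions for illustration and are not taken from the study.

```python
import numpy as np


def pairwise_distances(a):
    """Euclidean distance matrix, computed from the Gram matrix to save memory."""
    sq = (a ** 2).sum(axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (a @ a.T)
    return np.sqrt(np.maximum(d2, 0.0))


def cca_project(x, n_components=2, n_epochs=200, alpha0=0.5, seed=0):
    """Minimal curvilinear component analysis sketch (not a reference implementation)."""
    rng = np.random.default_rng(seed)
    n = x.shape[0]
    dx = pairwise_distances(x)  # distances in the input space, fixed

    # Assumption: start from the leading principal components.
    xc = x - x.mean(axis=0)
    u, s, _ = np.linalg.svd(xc, full_matrices=False)
    y = u[:, :n_components] * s[:n_components]

    lam0 = dx.max()
    for epoch in range(n_epochs):
        frac = epoch / n_epochs
        alpha = alpha0 * (1.0 - frac)   # assumed linear decay of the step size
        lam = lam0 * 0.01 ** frac       # assumed geometric shrinking of the radius
        for i in rng.permutation(n):
            delta = y - y[i]
            dy = np.maximum(np.sqrt((delta ** 2).sum(axis=1)), 1e-12)
            weight = (dy <= lam).astype(float)  # only short output distances count
            step = alpha * weight * (dx[i] - dy) / dy
            step[i] = 0.0                       # the pivot point itself does not move
            y += step[:, None] * delta
    return y


# Synthetic stand-in for the database: 60 spectra driven by two parameters.
channels = np.arange(4096)


def bump(center, width):
    return np.exp(-0.5 * ((channels - center) / width) ** 2)


u_grid, t_grid = np.meshgrid(np.linspace(0.1, 1.0, 6),
                             np.linspace(0.1, 1.0, 10), indexing="ij")
u_par, t_par = u_grid.ravel(), t_grid.ravel()
spectra = (u_par[:, None] * bump(1000, 20)
           + t_par[:, None] * bump(2500, 30)
           + np.sqrt(u_par * t_par)[:, None] * bump(1800, 40))

reduced = cca_project(spectra, n_components=2)
```

Here `spectra` has shape (60, 4096) and `reduced` has shape (60, 2), matching the matrix sizes given in the text.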
Figure 3.10 shows all the examples in the space reduced to 2 dimensions.
We have deliberately meshed the representation by showing the spatial
topology of the quantification performed by the investigators on the values
of concentrations of uranium and thorium.
The projection obtained by CCA has the same topology as the experimental
quantification. The concentrations of uranium and thorium were quantified
on the Cartesian product (u1, u2, ..., u6) × (t1, t2, ..., t10). Actually,
closer inspection shows that a test is missing: the database contains only
59 spectra. The missing data point is visible in the CCA projection of
Figure 3.10.
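The count of 59 instead of 6 × 10 = 60 can be checked mechanically by comparing the measured combinations with the full Cartesian product. In the sketch below, the level labels and the choice of which point is absent are purely hypothetical; the text does not say which test is missing.

```python
from itertools import product


def missing_grid_points(measured, u_levels, t_levels):
    """Return the (u, t) combinations of the full grid that have no measurement."""
    return sorted(set(product(u_levels, t_levels)) - set(measured))


u_levels = [f"u{i}" for i in range(1, 7)]    # u1 ... u6
t_levels = [f"t{j}" for j in range(1, 11)]   # t1 ... t10

full_grid = list(product(u_levels, t_levels))

# Hypothetical database: the full grid minus one arbitrarily chosen point.
measured = [point for point in full_grid if point != ("u3", "t7")]

missing = missing_grid_points(measured, u_levels, t_levels)
```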
The example shows the advantages of CCA: despite the nonlinear combination
of several effects on the spectra, dimensionality reduction allowed us
to reveal the intrinsic dimension of the data, namely the variation with
the concentrations of thorium and uranium. Using the reduced spectra, the
estimation of the uranium and thorium concentrations becomes
straightforward: regression with a small neural network, or even simple
linear interpolation, is more than sufficient.
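As an illustration of how little machinery is needed once the spectra are reduced to two coordinates, the sketch below fits an affine least-squares map from reduced coordinates to the two concentrations. This is a stand-in for the "simple linear interpolation" mentioned above, not the authors' procedure: the function names are hypothetical, and the data are synthetic and constructed so that an affine map is exact. On a real CCA projection, the map may be curved enough to need a small neural network.

```python
import numpy as np


def fit_affine(reduced, targets):
    """Least-squares fit of targets ~ [reduced, 1] @ coef."""
    design = np.hstack([reduced, np.ones((reduced.shape[0], 1))])
    coef, *_ = np.linalg.lstsq(design, targets, rcond=None)
    return coef


def predict_affine(coef, reduced):
    """Apply the fitted affine map to new reduced coordinates."""
    design = np.hstack([reduced, np.ones((reduced.shape[0], 1))])
    return design @ coef


# Synthetic example: concentrations on a 6 x 10 grid, and reduced coordinates
# that are a rotated, shifted copy of them (an assumption made for the demo).
u_grid, t_grid = np.meshgrid(np.linspace(0.1, 1.0, 6),
                             np.linspace(0.1, 1.0, 10), indexing="ij")
concentrations = np.column_stack([u_grid.ravel(), t_grid.ravel()])

angle = 0.3
rotation = np.array([[np.cos(angle), -np.sin(angle)],
                     [np.sin(angle), np.cos(angle)]])
reduced_coords = concentrations @ rotation.T + np.array([2.0, -1.0])

coef = fit_affine(reduced_coords, concentrations)
estimated = predict_affine(coef, reduced_coords)
```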
Applied to more complex problems, when the intrinsic dimension is not that
obvious, one may proceed iteratively by increasing, if necessary, the
number of components of the projection space, whilst monitoring the
preservation of the local topology on the bisector for short distances.
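One possible reading of this iterative procedure is sketched below: for each candidate dimension, compare pairwise distances before and after projection, and accept the first dimension for which the short distances stay on the bisector (output distance equal to input distance). The distortion measure, the quantile that defines "short", the tolerance, and the use of a PCA projection as a stand-in for CCA are all assumptions made to keep the example small.

```python
import numpy as np


def pairwise_distances(a):
    """Euclidean distance matrix (direct form; uses O(n^2 * d) memory)."""
    diff = a[:, None, :] - a[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


def short_distance_distortion(x, y, quantile=0.1):
    """Mean relative gap |dx - dy| / dx over the shortest output distances."""
    upper = np.triu_indices(x.shape[0], k=1)
    dx = pairwise_distances(x)[upper]
    dy = pairwise_distances(y)[upper]
    short = dy <= np.quantile(dy, quantile)
    return float(np.mean(np.abs(dx[short] - dy[short]) / np.maximum(dx[short], 1e-12)))


def pca_project(x, k):
    """Stand-in projection (PCA); a CCA projection would be used in practice."""
    xc = x - x.mean(axis=0)
    u, s, _ = np.linalg.svd(xc, full_matrices=False)
    return u[:, :k] * s[:k]


def smallest_adequate_dimension(x, project, max_dim, tol):
    """Increase the number of components until short distances are preserved."""
    for k in range(1, max_dim + 1):
        if short_distance_distortion(x, project(x, k)) <= tol:
            return k
    return None


# Synthetic data: 60 points on a 2-dimensional plane embedded in 50 dimensions.
rng = np.random.default_rng(0)
latent = rng.normal(size=(60, 2))
basis, _ = np.linalg.qr(rng.normal(size=(50, 2)))
data = latent @ basis.T

chosen = smallest_adequate_dimension(data, pca_project, max_dim=5, tol=1e-6)
```

With one component the plane is folded onto a line and short output distances no longer match the input distances; with two components they match, so the loop stops there.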
3.6 The Bootstrap and Neural Networks
The final section describes a new approach that allows automatic design and
training of neural networks. It is based on the statistical bootstrap method and
on the early stopping technique (the latter technique is described in Chap. 2).
The approach advocated here consists in starting the design of the model