Geoscience Reference
In-Depth Information
spectra (for the recursion start it would be one spectrum) to be selected in
the class. The next spectrum is added to the same class if the distance
between it and the selected class is less than a certain fixed value. After all
spectra have been tested, the classification procedure is repeated for the
remaining spectra, and so on, until the whole set has been divided into classes.
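The recursive procedure described above can be sketched as follows. This is only a minimal illustration, not the authors' code: the names `partition_spectra`, `distance` and `threshold` are hypothetical, the distance to a class is taken as the distance to its starting spectrum (the convention the text adopts), and the first remaining spectrum is used as the starting one purely as a placeholder assumption, since the text treats the choice of the starting spectrum as a separate problem.

```python
from typing import Callable, List, Sequence

Spectrum = Sequence[float]


def partition_spectra(
    spectra: Sequence[Spectrum],
    distance: Callable[[Spectrum, Spectrum], float],
    threshold: float,
) -> List[List[int]]:
    """Split spectra into classes; each class is a list of indices into `spectra`."""
    remaining = list(range(len(spectra)))
    classes: List[List[int]] = []
    while remaining:
        # Assumption: the first remaining spectrum starts the new class.
        start = remaining[0]
        members = [start]
        rest = []
        for idx in remaining[1:]:
            # Distance to the class = distance to its starting spectrum.
            if distance(spectra[idx], spectra[start]) < threshold:
                members.append(idx)
            else:
                rest.append(idx)
        classes.append(members)
        # Repeat the procedure for the spectra not yet classified.
        remaining = rest
    return classes


def max_abs_difference(a: Spectrum, b: Spectrum) -> float:
    """Toy distance used only for this illustration."""
    return max(abs(x - y) for x, y in zip(a, b))


toy_spectra = [[1.0, 2.0], [1.1, 2.1], [5.0, 6.0], [5.2, 6.1], [1.05, 1.95]]
classes = partition_spectra(toy_spectra, max_abs_difference, threshold=0.5)
# classes == [[0, 1, 4], [2, 3]]
```

Passing the distance as a function keeps the sketch independent of the particular measure, which matters here because different measures lead to different classification results.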
Many different functions that satisfy the axioms of a metric, as stated by
Kolmogorov and Fomin (1989), are available in the literature (Duran and
Odell 1974; Kolmogorov and Fomin 1989). Note that different metrics lead to
different classification results, so the metric has to be chosen according to the
conditions of the specific problem. As Duran and Odell (1974) point out, these
conditions sometimes do not allow a mathematically correct metric to be found.
For this reason, various heuristic measures are used (Duran and Odell 1974)
that are not metrics in the strict sense of the word. We have had to follow
the latter route and use the function below as a measure of the
distance between spectra $R^{(1)}$ and $R^{(2)}$:
$$\rho\left(R^{(1)}, R^{(2)}\right) = \max_i \frac{2\left|R^{(1)}_i - R^{(2)}_i\right|}{s_i\left(R^{(1)}_i + R^{(2)}_i\right)}, \tag{3.13}$$
where $s_i$ is the relative random standard deviation of the measured SBC (specific
values are given in the articles by Vasilyev A et al. 1997a, 1997b, 1997c).
Formula (3.13) takes into account the differences between spectra at every
wavelength, rather than averaged over all wavelengths, because a difference
between spectra even within a narrow spectral region can turn out to be
essential for the classification. This is especially important for revealing
erroneous spectra, as will be considered further. To convert to relative values,
the difference of the spectrum values in (3.13) is normalized to their arithmetic
mean and to the standard deviation, which accounts for the variation of the
uncertainty over the spectrum. The distance between a spectrum and a class is
taken to be the distance to the starting spectrum of the class. The spectrum is
attributed to the class if the distance is less than 3, i.e. according to the known
statistical rule “the allowed deviation from the average does not exceed three
standard deviations”.
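As an illustration of the distance measure (3.13) and the threshold of 3, a minimal sketch is given below. The function names `spectral_distance` and `belongs_to_class` and the numerical values are hypothetical and are not taken from the cited measurements; the sketch also assumes that all spectrum values and all relative standard deviations are positive, so that the denominator does not vanish.

```python
from typing import Sequence


def spectral_distance(
    r1: Sequence[float], r2: Sequence[float], s: Sequence[float]
) -> float:
    """Distance (3.13): the largest difference over wavelengths, taken
    relative to the mean of the two values and scaled by s_i."""
    if not (len(r1) == len(r2) == len(s)):
        raise ValueError("r1, r2 and s must have the same length")
    # 2|a - b| / (s_i (a + b)) equals |a - b| / (s_i * mean(a, b)).
    return max(
        2.0 * abs(a - b) / (s_i * (a + b)) for a, b, s_i in zip(r1, r2, s)
    )


def belongs_to_class(
    spectrum: Sequence[float],
    start: Sequence[float],
    s: Sequence[float],
    threshold: float = 3.0,
) -> bool:
    """Attribute `spectrum` to the class whose starting spectrum is `start`
    if the distance (3.13) is below the threshold (3 in the text)."""
    return spectral_distance(spectrum, start, s) < threshold


# Hypothetical two-wavelength example with a 5 % relative standard deviation.
start = [1.0, 2.0]
s = [0.05, 0.05]
close = [1.1, 2.0]  # differs by about 1.9 standard deviations at the first wavelength
far = [1.0, 2.4]    # differs by about 3.6 standard deviations at the second wavelength

close_in_class = belongs_to_class(close, start, s)
far_in_class = belongs_to_class(far, start, s)
```

Because the maximum over wavelengths is used instead of an average, a single strongly deviating wavelength is enough to exclude a spectrum from the class, which is the behaviour the text relies on for revealing erroneous spectra.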
The choice of the starting spectrum (and, recursively, of the starting spectrum
of every following class) is the ill-defined point of cluster analysis. The
problem is that the starting spectrum should correspond to the maximum of the
multi-dimensional distribution function (the histogram) of the whole set of
classified objects, and the search for this maximum is a difficult problem from
both the computational and the mathematical point of view. We have analyzed the
applicability to our problem of the different approaches to choosing the starting
spectrum described by Duran and Odell (1974) and finally decided in
favor of the following algorithm. Every spectrum of the classified set is
tested for the possibility of using it as a starting one. For this spectrum,
the number of spectra falling into the same class is determined, together with the
average distance (3.13) to the spectra of this class and the ratio of
the average distance to the number of spectra in the class. The latter