1. Initialization: the maximal number of iterations N_iter is chosen.
2. Iteration t: W^(t-1) and σ^(t-1) were computed at the previous iteration.
Minimization phase: computation of the new parameters W^t and σ^t;
Allocation phase: update of the allocation function χ^t that is associated
with W^t and σ^t from the relation χ(z) = arg max_c p_c(z).
3. Iterate until t > N_iter or until stabilization of the cost function E(W, σ, χ).
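The alternation of the two phases can be sketched numerically. The snippet below is a minimal 1-D illustration, not the implementation described here: the initialization, the exact cooling schedule, the Gaussian neighborhood kernel, and the weighted update formulas for W and σ are simplifying assumptions; only the alternation of an allocation phase χ(z) = arg max_c p_c(z) with a minimization phase, under a decreasing temperature T, follows the text.

```python
import numpy as np

def gaussian_density(z, w_c, sigma_c):
    # isotropic Gaussian density p_c(z) of neuron c, with mean w_c and std sigma_c
    d = z.shape[1]
    sq = ((z - w_c) ** 2).sum(axis=1)
    return np.exp(-sq / (2 * sigma_c ** 2)) / ((2 * np.pi) ** (d / 2) * sigma_c ** d)

def prsom_sketch(data, n_neurons=10, n_iter=50, t_max=5.0, t_min=0.5, seed=0):
    """Hypothetical sketch of one PRSOM-style training loop on a 1-D map."""
    rng = np.random.default_rng(seed)
    n, d = data.shape
    # initialize means from random observations; initial std from a large data sample
    w = data[rng.choice(n, n_neurons, replace=False)].astype(float)
    sigma = np.full(n_neurons, data.std())
    grid = np.arange(n_neurons)  # positions of the neurons on the 1-D map

    for t in range(1, n_iter + 1):
        # cooling schedule: temperature T decreases from t_max to t_min
        T = t_max * (t_min / t_max) ** ((t - 1) / max(n_iter - 1, 1))
        # allocation phase: chi(z) = argmax_c p_c(z)
        dens = np.stack([gaussian_density(data, w[c], sigma[c])
                         for c in range(n_neurons)], axis=1)
        chi = dens.argmax(axis=1)
        # neighborhood weights between each winning neuron and every neuron
        h = np.exp(-(grid[chi][:, None] - grid[None, :]) ** 2 / (2 * T ** 2))
        # minimization phase: neighborhood-weighted means and std deviations
        norm = np.maximum(h.sum(axis=0), 1e-12)
        w = (h.T @ data) / norm[:, None]
        sq = ((data[:, None, :] - w[None, :, :]) ** 2).sum(axis=2)
        sigma = np.sqrt(np.maximum((h * sq).sum(axis=0) / (norm * d), 1e-12))
    return w, sigma, chi
```

As the temperature decreases, each neuron's weighted mean is dominated by the observations allocated to it and its close map neighbors, which is what organizes the map.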
As for SOM training, PRSOM uses a neighborhood whose size is controlled by
the temperature parameter T . During training, the size of the neighborhood
decreases according to the cooling schedule. At the end of training, the map
provides an organized structure of the average vector set, and the partition
associated with the map is defined by the final allocation function χ^(N_iter). As for
other versions of SOM, the data space D is divided into M subsets: each neuron
c of the map represents a data subset P_c = {z : χ^(N_iter)(z) = c}. That map and
that partition were determined from probability distributions, whereas SOM
just uses Euclidean distances. The probability density estimation gives access
to additional information that may be useful for application purposes.
Actually, that information is crucial as far as classification problems are concerned.
No stochastic version of PRSOM is available: a large sample of the data is
necessary to estimate the initial variance before updating the parameters.
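The extra information mentioned above can be made concrete: once the means and standard deviations of the Gaussian modes are estimated, every observation gets not just an allocation but a full set of membership probabilities. The helper below is a hypothetical illustration (equal mixture priors and isotropic Gaussians are simplifying assumptions, and the numerical values are made up):

```python
import numpy as np

def neuron_posteriors(z, w, sigma):
    """Posterior p(c | z) under an equal-prior Gaussian mixture whose modes
    have means w[c] and standard deviations sigma[c] (hypothetical helper)."""
    d = len(z)
    diff = w - z                                   # (M, d)
    log_p = (-(diff ** 2).sum(axis=1) / (2 * sigma ** 2)
             - d * np.log(sigma))                  # log p_c(z) up to a constant
    log_p -= log_p.max()                           # numerical stabilization
    p = np.exp(log_p)
    return p / p.sum()

# made-up two-neuron map in the plane
w = np.array([[0.0, 0.0], [5.0, 5.0]])
sigma = np.array([1.0, 1.0])
post = neuron_posteriors(np.array([0.5, -0.2]), w, sigma)
chi_z = int(post.argmax())   # allocation chi(z): here neuron 0, the nearer mean
```

An observation whose largest posterior is still small lies far from every mode, which is one way the density estimate supports outlier tracking.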
PRSOM provides a lot of additional information about the training data
(tracking outliers, computing probabilities, etc.). However, that model can be
used only if the training observation set is large enough to allow an accurate
estimation of the standard deviations of the Gaussian modes of the mixture
in the initialization phase. Remote sensing, where a tremendous amount of
data is available, is ideally suited to applications of SOM. The detection of
ocean color is described in the next section.
7.4 Classification and Topological Maps
Among the various applications of SOM, many are classification
tasks. As stated above, classification is not a straightforward application of
self-organization: unsupervised learning provides an allocation function that
assigns any observation to a cluster of a partition of the training set, irrespec-
tive of the semantics of the data. In such problems, it is assumed that a lot of
noise-corrupted observations are available with no knowledge of their class.
The partition that is obtained depends on the probability density underlying
the training set. Regions that contain a high density of data will be covered
by a fine partition; low-density regions will be covered by a coarse partition.
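This density dependence can be checked numerically. The snippet below is an illustration of the claim, not code from the text: plain Lloyd/k-means iterations stand in for a map whose neighborhood has shrunk to zero, applied to a made-up 1-D sample mixing a dense region and a sparse one.

```python
import numpy as np

rng = np.random.default_rng(0)
dense = rng.normal(0.0, 0.2, 900)     # high-density region around 0
sparse = rng.uniform(3.0, 10.0, 100)  # low-density region spread over [3, 10]
data = np.concatenate([dense, sparse])

k = 10
# spread-out initialization from the empirical quantiles of the data
centers = np.quantile(data, np.linspace(0.05, 0.95, k))
for _ in range(50):
    # assign each point to its nearest prototype, then recompute the means
    labels = np.abs(data[:, None] - centers[None, :]).argmin(axis=1)
    for c in range(k):
        if (labels == c).any():
            centers[c] = data[labels == c].mean()

# with this sample, most prototypes end up serving the dense region,
# so its cells are much narrower than those covering the sparse region
in_dense = (np.abs(centers) < 1.0).sum()
```

The same effect drives the SOM partition: prototypes accumulate where observations are plentiful, so those regions are quantized finely and described accurately.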
The large amount of data available in high-density regions provides accurate
information on those regions. On the other hand, the geometry of the partition
depends on the nature of the encoding of the observations. Thus, for a
given problem and a given data set, several different encodings may generate
several partitions of the data space. With the SOM algorithm, the selection