The most important feature of self-organizing maps is that they make it
possible to compare the clusters that summarize the data. Each observation is
allocated to a cluster, and each cluster is projected onto a node of the map.
Comparing the projections of different observations gives an estimate of the
proximity between their respective clusters: similar observations are
projected onto the same node; otherwise, the dissimilarity increases with the
distance, computed on the map, that separates the two projections. Thus, the
cluster space is identified with the map, so that the projection makes it
possible to visualize the cluster space and the observation space
simultaneously.
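The projection described above can be illustrated with a minimal sketch. It
assumes a hypothetical 3x3 map whose reference vectors are made up rather than
trained, and it uses the Euclidean distance both in the observation space and
on the map grid; the names GRID, REFERENCE, project and map_distance are
illustrative and do not come from the text.

```python
import math

# Hypothetical 3x3 map: each node is a grid position with a reference vector.
# The reference vectors are invented for illustration; they are not trained.
GRID = [(r, c) for r in range(3) for c in range(3)]
REFERENCE = {(r, c): (float(r), float(c)) for (r, c) in GRID}


def project(observation):
    """Return the grid node whose reference vector is closest to the observation."""
    return min(GRID, key=lambda node: math.dist(observation, REFERENCE[node]))


def map_distance(node_a, node_b):
    """Distance between two projections, measured on the map grid."""
    return math.dist(node_a, node_b)


a = project((0.1, 0.2))  # close to the reference vector of node (0, 0)
b = project((0.2, 0.1))  # similar to the first observation
c = project((1.9, 2.1))  # far from the first two observations

print(a, b, map_distance(a, b))  # same node, map distance 0.0
print(a, c, map_distance(a, c))  # different nodes, larger map distance
```

The two similar observations land on the same node, while the third lands on
a node that is far away on the grid, which is how the map conveys
dissimilarity between clusters.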
Unsupervised classifiers and self-organizing maps are closely related; most
such methods of clustering aim at aggregating similar data. In that context,
similar means close with regard to the application field and the underlying
metric. The topological ordering is the specific contribution of neural networks
with unsupervised learning to clustering, a key theme in data analysis [Duda
et al. 1973; Jain et al. 1988].
In current decision systems, any clustering may contribute to supervised
classification as well. Most applications that use self-organizing maps are
classifiers; some of them also perform regression. Several reasons explain
this:
- Straightforward modifications of the basic algorithm allow its use as a
  supervised training algorithm [Cerkassky et al. 1991].
- Results of unsupervised training algorithms may easily be integrated into
  data processing systems that touch the same areas of interest as multilayer
  Perceptrons. Therefore, self-organizing maps are used to pre-process data:
  information provided by self-organizing maps may be processed by other
  algorithms for regression or classification.
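As a rough illustration of the pre-processing role, the sketch below replaces
each observation by the index of its closest node and lets a separate,
deliberately simple regression stage average the targets per node. The
one-dimensional map, its reference vectors, the data and the per-node mean
regressor are all assumptions made for the example, not a method taken from
the text.

```python
import math
from collections import defaultdict

# Hypothetical 1-D map with three nodes; reference vectors are invented, not trained.
REFERENCE = [(0.0,), (5.0,), (10.0,)]


def node_of(x):
    """Pre-processing: replace an observation by the index of its closest node."""
    return min(range(len(REFERENCE)), key=lambda i: math.dist(x, REFERENCE[i]))


def fit_node_means(xs, ys):
    """Downstream regression stage: average the targets observed in each node."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for x, y in zip(xs, ys):
        k = node_of(x)
        sums[k] += y
        counts[k] += 1
    return {k: sums[k] / counts[k] for k in sums}


# Made-up training data: observations and their desired responses.
xs = [(0.5,), (1.0,), (4.5,), (5.5,), (9.0,)]
ys = [1.0, 3.0, 10.0, 12.0, 20.0]
node_means = fit_node_means(xs, ys)


def predict(x):
    """Predict with the mean target of the node the observation falls into."""
    return node_means[node_of(x)]


print(predict((0.2,)))  # mean target of node 0
print(predict((6.0,)))  # mean target of node 1
```

In this sketch a node that received no training observation has no entry in
node_means, so predict would raise a KeyError for it; a real system would need
a fallback for such nodes.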
Actually, clustering (unsupervised classification) turns out to be
complementary to discrimination (supervised classification), as described in
Chap. 6. In a sense, any application project uses supervised information to
some extent. Any system needs to be validated before use, and validation
relies on available expert knowledge: an expert has processed some of the
available data, so the desired response associated with those data is known
and may be used to tune the automatic system. In particular, this knowledge
may be used to improve unsupervised models. If expert knowledge is widely
available, it can be exploited from the beginning of the analysis, using
supervised forms of self-organizing maps. Conversely, if it is scarce, it can
only be used to interpret the results of the unsupervised analysis: expert
knowledge is applied after the clustering tasks are completed. The approach
is then sequential: first, a partition of the data set is sought; the
recognition itself is performed subsequently.
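The sequential approach can be sketched as follows, under stated assumptions:
the partition is taken as already computed, the scarce expert knowledge is a
handful of labelled observations, and each cluster is named by a majority vote
among the labelled observations it contains. The observation names, labels and
the voting rule are hypothetical choices for illustration.

```python
from collections import Counter, defaultdict

# Step 1 (assumed already done): an unsupervised analysis has assigned each
# observation to a cluster. The assignments below are invented.
cluster_of = {"obs1": 0, "obs2": 0, "obs3": 0, "obs4": 1, "obs5": 1, "obs6": 2}

# Scarce expert knowledge: only a few observations carry a desired response.
expert_label = {"obs1": "normal", "obs2": "normal", "obs4": "faulty"}


def label_clusters(cluster_of, expert_label):
    """Step 2: name each cluster by majority vote of its expert-labelled members."""
    votes = defaultdict(Counter)
    for obs, label in expert_label.items():
        votes[cluster_of[obs]][label] += 1
    return {k: counter.most_common(1)[0][0] for k, counter in votes.items()}


cluster_label = label_clusters(cluster_of, expert_label)


def recognize(obs):
    """Step 3: an observation inherits the label of its cluster (None if unlabelled)."""
    return cluster_label.get(cluster_of[obs])


print(recognize("obs3"))  # label inherited from cluster 0
print(recognize("obs5"))  # label inherited from cluster 1
print(recognize("obs6"))  # cluster 2 has no expert label
```

Clusters that contain no expert-labelled observation stay unlabelled, which
makes visible how much of the partition the available expert knowledge
actually covers.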
Self-organizing maps and their theoretical foundations are presented in this
chapter. Those algorithms are described under a unified formalism, in order to
connect them with data analysis methods from which they actually stemmed:
self-organizing map algorithms may be viewed as extensions of well-known