Numerous publications dedicated to the subject exist (Deboeck and Kohonen, 1998; Oja and Kaski, 1999; Kohonen, 1982, 1990,
2001), so in this chapter the method is introduced in only the briefest terms. A SOM is
an artificial neural network in which neurons are arranged as a low-dimensional, typi-
cally two-dimensional, lattice such that each neuron has either four or six neighbours (i.e.
square or hexagonal neighbourhood). Each neuron is associated with an n-dimensional
vector of weights. Input data presented to the neuron lattice are of the same dimensionality.
For example, 30 population attributes associated with 200 000 census enumeration
units would correspond to a training data set consisting of 200 000 thirty-dimensional
vectors.
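As a minimal, package-independent sketch (the 20 x 15 grid size is an assumption; the 30 dimensions echo the census example above), such a lattice can be represented as a three-dimensional array in which the first two axes index the neuron grid and the third holds each neuron's weight vector:

```python
import numpy as np

# Hypothetical lattice: 20 x 15 neurons, each holding a 30-dimensional weight
# vector (30 matches the census-attribute example; the grid size is arbitrary).
rows, cols, n_dims = 20, 15, 30
rng = np.random.default_rng(0)

# Weights are commonly initialized with small random values (or by sampling
# from the input data); random initialization is used here for simplicity.
weights = rng.random((rows, cols, n_dims))

# The 200 000 census enumeration units would then form a training array of
# shape (200000, 30), matching the weight dimensionality.
```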
During training, one input vector at a time is presented to all the neurons and the most
similar neuron is determined, typically on the basis of Euclidean distance. The n
weights of that so-called best-matching unit (BMU) are then adjusted towards an even
better match. More important - and essential for the self-organizing nature of a SOM -
is that weights of neighbouring neurons around the BMU are likewise adjusted, up to a
certain neighbourhood size and with a diminishing magnitude best described as distance
decay. Over the course of many such training runs, the low-dimensional lattice of neuron
vectors begins to replicate major topological structures existing in the n-dimensional input
space.
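The fragment below sketches a single such training step under the assumptions stated here: Euclidean distance for the BMU search, a square lattice, and a Gaussian distance-decay term for the neighbourhood adjustment (function and parameter names are illustrative rather than taken from any particular SOM implementation):

```python
import numpy as np

def train_step(weights, x, learning_rate=0.1, radius=3.0):
    """One SOM training step: find the BMU for input vector x and pull the
    BMU and its lattice neighbours towards x with distance-decayed strength."""
    rows, cols, _ = weights.shape

    # Best-matching unit: the neuron whose weight vector is closest to x.
    dists = np.linalg.norm(weights - x, axis=2)
    bmu = np.unravel_index(np.argmin(dists), (rows, cols))

    # Grid distance of every neuron to the BMU (square lattice assumed;
    # a hexagonal lattice would use offset or axial coordinates instead).
    r_idx, c_idx = np.indices((rows, cols))
    grid_dist2 = (r_idx - bmu[0]) ** 2 + (c_idx - bmu[1]) ** 2

    # Gaussian distance decay: neighbours closer to the BMU are adjusted more.
    influence = np.exp(-grid_dist2 / (2 * radius ** 2))

    # Move weight vectors towards the input, scaled by learning rate and influence.
    weights += learning_rate * influence[:, :, None] * (x - weights)
    return bmu
```

In a full training run, both the learning rate and the neighbourhood radius would shrink over time, which is what gradually produces the topology-preserving arrangement described above.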
A trained two-dimensional SOM can itself be visualized in various forms, including the
display of weights for a particular variable as colour shading across the neuron lattice.
This is also known as a component plane display; an example is included later in this
chapter. One could also opt for a display based on multi-dimensional computation, such
as clustering of neuron vectors using hierarchical or k-means clustering (Skupin, 2004). A
very popular choice has been to visualize n-dimensional differences among neighbour-
ing neurons using the so-called U-matrix method (Ultsch, 1993). Finally, the original
input vectors or other vectors, if they contain the same variables and underwent identi-
cal preprocessing, could also be visualized. This involves finding, for each such vector,
its BMU among the trained neurons and placing a point symbol and text label at that
BMU's location.
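Continuing the earlier sketch, these displays can be derived directly from a trained weight array: a component plane is one slice of the weights, a U-matrix-style surface summarizes distances between neighbouring neuron vectors (the version below averages distances to the four grid neighbours, a simplification of Ultsch's formulation), and input vectors are placed at the grid location of their BMU:

```python
import numpy as np

def component_plane(weights, variable_index):
    """Component plane: one variable's weight values across the lattice,
    ready to be rendered as colour shading."""
    return weights[:, :, variable_index]

def umatrix(weights):
    """Simplified U-matrix: mean Euclidean distance from each neuron to its
    immediate grid neighbours."""
    rows, cols, _ = weights.shape
    u = np.zeros((rows, cols))
    for r in range(rows):
        for c in range(cols):
            neighbours = [(r + dr, c + dc)
                          for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1))
                          if 0 <= r + dr < rows and 0 <= c + dc < cols]
            u[r, c] = np.mean([np.linalg.norm(weights[r, c] - weights[nr, nc])
                               for nr, nc in neighbours])
    return u

def map_to_bmus(weights, data):
    """Place each input vector at the grid location of its best-matching unit,
    e.g. for positioning point symbols and text labels."""
    rows, cols, _ = weights.shape
    return [np.unravel_index(np.argmin(np.linalg.norm(weights - x, axis=2)),
                             (rows, cols)) for x in data]
```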
Further computational and visual transformations may be desired, but existing SOM
software is in fact severely limited in that respect. The vast majority of SOM examples -
at least those used for visualization purposes - rely on one of a limited number of
available SOM software solutions. Extremely popular has been the SOM software created
by the Neural Networks Research Centre at the Helsinki University of Technology. One
important reason for this popularity is that the software is freely available, including access
to the source code. SOM_PAK (Kohonen et al., 1996a) is a collection of programs written in
C, which can be compiled for different platforms, although Windows executables are also
available. It implements the standard SOM training algorithm and was used for all examples
presented in this chapter. Its visualization functionality is, however, rudimentary. This was
a major reason for our implementation of GIS-based storage and visualization of a trained
SOM. From the same source as SOM_PAK comes the equally free SOM Toolbox for Matlab
(although it requires an existing Matlab installation), which includes various visualization
options. However, compared with graphic design or GIS software, it is much harder to let
a user's imagination drive the control and transformation of these visualizations. That is
why the majority of visual examples of SOM Toolbox applications found in the literature
have a fairly uniform appearance. That is also the case for most commercial SOM software,
like Viscovery SOMine (www.eudaptics.de).
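As a bridge to such GIS-based post-processing, the sketch below reads a trained SOM_PAK codebook (.cod) file into the weight-array layout used in the earlier fragments. It assumes the header documented for SOM_PAK (vector dimension, lattice topology, x and y map dimensions, neighbourhood function) followed by one whitespace-separated weight vector per line; the unit ordering assumed here (x index varying fastest) should be checked against the SOM_PAK documentation for the version in use:

```python
import numpy as np

def read_sompak_codebook(path):
    """Read a SOM_PAK-style .cod codebook into a (ydim, xdim, n) weight array.

    Assumes the documented header 'dim topology xdim ydim neighbourhood' and one
    weight vector per subsequent line; lines starting with '#' are skipped and
    any trailing labels are ignored. Unit ordering is an assumption to verify."""
    with open(path) as f:
        lines = [ln.strip() for ln in f if ln.strip() and not ln.startswith("#")]

    dim_str, topology, xdim_str, ydim_str, neighbourhood = lines[0].split()[:5]
    n, xdim, ydim = int(dim_str), int(xdim_str), int(ydim_str)

    units = [list(map(float, ln.split()[:n])) for ln in lines[1:1 + xdim * ydim]]
    weights = np.array(units).reshape(ydim, xdim, n)
    return weights, topology, neighbourhood
```

Each neuron can then be written out as a point or polygon feature with its weight values as attributes, which is the kind of GIS-based storage and visualization of a trained SOM referred to above.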