center and the record's input values. This portion, and thus the magnitude of
the change in the weights, is determined by a change or learning rate parameter
referred to as eta. Typically, the first phase has a relatively large eta to learn the
overall data structure and the second phase incorporates a smaller eta to fine-tune
the cluster centers.
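The role of eta can be illustrated with a minimal sketch of the winner update. This is an illustration only, not the IBM SPSS Modeler implementation: the function names, the sample centers, and the two eta values (0.5 for a coarse first-phase step, 0.1 for a fine second-phase step) are assumptions chosen for the example.

```python
import math

def nearest_center(centers, record):
    """Return the index of the center closest to the record (Euclidean distance)."""
    return min(range(len(centers)), key=lambda i: math.dist(centers[i], record))

def update_center(center, record, eta):
    """Move the center toward the record by a fraction eta of their difference."""
    return [w + eta * (x - w) for w, x in zip(center, record)]

# Hypothetical cluster centers and one input record (illustrative values).
centers = [[0.0, 0.0], [10.0, 10.0]]
record = [2.0, 4.0]

winner = nearest_center(centers, record)

# A large eta moves the winning center a long way toward the record,
# a small eta only nudges it.
coarse = update_center(centers[winner], record, eta=0.5)
fine = update_center(centers[winner], record, eta=0.1)
```

With a larger eta the center jumps halfway to the record, which suits learning the overall structure; with a smaller eta it moves only a tenth of the way, which suits fine-tuning.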
Although quite similar, Kohonen networks and K-means also have significant
differences. First of all, clusters in Kohonen networks are spatially arranged in a
grid map. Moreover, the "winning" of records by a neuron/cluster also affects the
weights of the surrounding neurons. Output neurons located symmetrically around the
"winning" neuron form a "neighborhood" of nearby units. Assigning a record to a neuron
therefore adjusts the weights of all neurons in its neighborhood. Because of this neighborhood
adaptation, the topology of the output map has a practical meaning, with similar
clusters appearing close together as nearby neurons.
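The neighborhood adaptation described above can be sketched as follows. This is a simplified illustration under stated assumptions: the neighborhood is taken to be a square of a given radius around the winner, and every neuron in it is updated with the same eta. Actual implementations may shrink the neighborhood or weight neighbors differently; all names here are hypothetical.

```python
import math

def neighborhood(winner, rows, cols, radius=1):
    """Grid positions within `radius` cells of the winner (winner included)."""
    wr, wc = winner
    return [(r, c) for r in range(rows) for c in range(cols)
            if max(abs(r - wr), abs(c - wc)) <= radius]

def train_step(weights, record, eta, radius=1):
    """Find the winning neuron, then pull it and its grid neighbors toward the record."""
    rows, cols = len(weights), len(weights[0])
    winner = min(((r, c) for r in range(rows) for c in range(cols)),
                 key=lambda rc: math.dist(weights[rc[0]][rc[1]], record))
    for r, c in neighborhood(winner, rows, cols, radius):
        weights[r][c] = [w + eta * (x - w) for w, x in zip(weights[r][c], record)]
    return winner

# Hypothetical 3 x 3 output map; each neuron starts with weights equal to its grid position.
weights = [[[float(r), float(c)] for c in range(3)] for r in range(3)]
winner = train_step(weights, record=[0.0, 0.0], eta=0.5)
```

Because neighbors move together with the winner, neurons that sit close on the grid end up with similar weights, which is what gives the output map its topological meaning.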
Output units with no winning records are removed from the solution. The
retained output units represent the probable clusters. Users can specify the
topology of the solution, that is, the maximum width and length dimensions of the
output grid map. Selecting the right number of rows and columns for the output
map requires trial and error.
Analysts should also evaluate the geometry/similarity and the density/frequency
of the proposed clusters. Kohonen networks involve many iterations
and weight adjustments and are consequently considerably slower than
TwoStep and K-means. Nevertheless, they are worth trying as a clustering
alternative, especially because of the geometrical representation of the cluster
similarity that they provide.
In Kohonen network models, cluster assignment is represented by two
generated fields which denote the grid map co-ordinates (for instance, X = 1,
Y = 3) of each record. These two fields should normally be concatenated into a
single cluster membership field. A common and useful graphical representation
of the geometry of the derived solution is through a simple scatterplot, with
all records placed in the two-dimensional space defined by the grid co-ordinate
fields. A scatterplot like that for a nine-cluster (3 × 3) solution is presented
in Figure 3.10, depicting the values of the cluster membership field. This plot
visually represents the density and the relative position, and hence similarity, of
the resulting clusters.
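As a rough sketch of the two steps just described, the snippet below concatenates the grid co-ordinates into a single cluster membership value and tabulates the number of records per grid cell, a text stand-in for the scatterplot. The co-ordinate pairs, the label format, and the function names are hypothetical and are not taken from IBM SPSS Modeler output.

```python
from collections import Counter

# Hypothetical scored records: one (x, y) grid co-ordinate pair per record.
coords = [(0, 0), (0, 0), (0, 1), (2, 2), (2, 2), (2, 2), (1, 0)]

def cluster_label(x, y):
    """Concatenate the two grid co-ordinates into one cluster membership value."""
    return f"X={x},Y={y}"

labels = [cluster_label(x, y) for x, y in coords]

# Records per cluster; output units that won no records never appear here.
density = Counter(labels)

def density_grid(coords, width, height):
    """Count records per grid cell: one row per y value, one column per x value."""
    counts = Counter(coords)
    return [[counts.get((x, y), 0) for x in range(width)] for y in range(height)]

grid = density_grid(coords, width=3, height=3)
for row in grid:
    print(" ".join(str(n) for n in row))
```

Cells with high counts correspond to dense clusters, and adjacent non-empty cells correspond to clusters that the map treats as similar.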
Recommended Kohonen Network/SOM Options
Figures 3.11 and 3.12 and Table 3.11 explain the settings of the IBM SPSS
Modeler Kohonen network/SOM model and provide suggestions for fine-tuning
the algorithm.