Towards High-resolution Self-organizing Maps of Geographic Features - Geographic Visualization: Concepts, Tools and Applications

because a 400-cluster solution for 51 objects is not very useful. Instead, the SOM allows

detailed two-dimensional layout of the geographic objects. Those states still assigned to a

single neuron (e.g. Louisiana (LA) and Mississippi (MS)) are too similar to be distinguishable

even at this level of granularity.

The detail provided in high-resolution SOMs makes it possible to have them play the role

of a base map onto which various other data could be mapped. This is particularly true because

a SOM does not directly represent the input vectors as such, in contrast to

such methods as multidimensional scaling (MDS) or spring models. Instead, it creates a low-dimensional

output model of the n-dimensional input space. That model can be applied to

other data, as long as they have the same dimensionality. Once those data are mapped onto

the SOM, other features can be attached. For example, if a SOM is constructed from multi-temporal

demographic attributes of geographic objects, one could link individual temporal

vertices to form trajectories and then visualize previously unrelated attributes onto those

trajectories (Skupin and Hagelman, 2005). From clustering to labelling of neuron regions, a

number of transformations have been proposed that all depend on a view of high-resolution

SOMs as base maps (Skupin, 2004).
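
The base-map idea described above, applying a trained SOM to other vectors of the same dimensionality and linking temporal vertices into trajectories, can be sketched as follows. This is a minimal illustration, not the procedure used in the cited studies: it assumes a rectangular lattice stored as nested lists, squared Euclidean distance for matching, and made-up toy weights. The names `best_matching_unit` and `trajectory` are hypothetical.

```python
import math


def best_matching_unit(weights, vector):
    """Return (row, col) of the neuron whose weight vector is closest to `vector`.

    `weights[row][col]` is an n-dimensional list (assumed layout).
    """
    best, best_dist = None, math.inf
    for r, row in enumerate(weights):
        for c, w in enumerate(row):
            if len(w) != len(vector):
                raise ValueError("input must have the same dimensionality as the SOM")
            # Squared Euclidean distance; the ranking matches plain Euclidean distance.
            dist = sum((a - b) ** 2 for a, b in zip(w, vector))
            if dist < best_dist:
                best, best_dist = (r, c), dist
    return best


def trajectory(weights, snapshots):
    """Map one object's time-ordered attribute vectors to a path of lattice positions."""
    return [best_matching_unit(weights, v) for v in snapshots]


# Toy 2x2 SOM over two attributes (made-up weights, for illustration only).
toy_som = [
    [[0.0, 0.0], [0.0, 1.0]],
    [[1.0, 0.0], [1.0, 1.0]],
]

# One hypothetical geographic object observed at three time steps.
path = trajectory(toy_som, [[0.1, 0.1], [0.2, 0.9], [0.9, 0.8]])
print(path)  # [(0, 0), (0, 1), (1, 1)]
```

Because the lattice positions are ordinary two-dimensional coordinates, the resulting path can be drawn as a line feature on top of the SOM, and further attributes can be attached to its vertices.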

Most SOM software solutions provide limited support for effectively storing and transforming

large neuron lattices and derived data, such as trajectories and surfaces. One alternative

is to leverage the ability of GIS to deal with large, low-dimensional geometric

data sets. Within GIS one can first choose among the various geometric data models, have

access to various database solutions and perform a wide array of transformations, from

interpolation to overlay operations. Finally, in the hands of a cartographer, GIS can produce

attractive visualizations with a large degree of automation (for example for the complex task

of feature labelling), while still performing data-driven visualization. Use of GIS can thus

make high-resolution SOMs a much less daunting proposition on many levels.
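
One possible way to hand a neuron lattice to GIS software is to write each neuron as a point feature that carries its weight vector as attributes. The sketch below uses GeoJSON purely as an assumed interchange format; the text does not prescribe one. Lattice row and column indices are used as planar coordinates, which departs from the geographic coordinates GeoJSON normally expects, so a GIS would need to treat the layer as unprojected. The function name and attribute names are hypothetical.

```python
import json


def lattice_to_geojson(weights, attribute_names):
    """Convert a rectangular lattice of weight vectors into a GeoJSON-style dict.

    Each neuron becomes a Point at (col, row); its weights become named properties.
    """
    features = []
    for r, row in enumerate(weights):
        for c, w in enumerate(row):
            properties = {"row": r, "col": c}
            properties.update(zip(attribute_names, w))
            features.append({
                "type": "Feature",
                "geometry": {"type": "Point", "coordinates": [float(c), float(r)]},
                "properties": properties,
            })
    return {"type": "FeatureCollection", "features": features}


# Toy 1x2 lattice with two made-up attributes per neuron.
collection = lattice_to_geojson(
    [[[0.2, 0.8], [0.6, 0.4]]],
    ["attr_a", "attr_b"],
)
text = json.dumps(collection)  # ready to be written to a file and loaded in a GIS
print(len(collection["features"]))
```

Once the lattice is stored this way, operations such as interpolation of a component surface or overlay with trajectory lines can be delegated to the GIS.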

8.3.2 Examples of high-resolution SOMs

Most SOM implementations are based on lattices of no more than a few hundred neurons,

and typically far fewer. A few examples of large SOMs exist, though. Most of these

were in fact created by the research group around the method's inventor, Teuvo Kohonen.

In the mid-1990s they mapped more than 130 000 newsgroup postings onto a SOM that

eventually consisted of 49 152 neurons, though in a two-stage process that began with a

much smaller SOM of 768 units, from which the larger SOM was interpolated and further

training was then applied (Kohonen et al., 1996b). By far the largest SOM known was

created from the text of almost 7 million patent applications (Kohonen, 2001). Training was

a three-step process, during which progressively finer SOMs were created, beginning with

a 435-neuron SOM and eventually leading to a model consisting of 1 002 240 map units.

Training took 6 weeks on a six-processor computer system.
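
The multi-stage idea, training a small SOM, interpolating a finer lattice from it and then continuing training, can be illustrated with a simple upscaling step. The cited sources are not quoted here on how the interpolation was performed, so the bilinear scheme below is an assumption chosen for clarity, as are the function name `upscale_lattice` and the toy weights. For orientation, scaling both lattice axes by a factor of 8 multiplies the neuron count by 64, which matches the ratio 49 152 / 768, although the actual lattice shapes are not given in the text.

```python
def upscale_lattice(weights, factor):
    """Bilinearly interpolate a rows x cols lattice of weight vectors to a finer one."""
    rows, cols = len(weights), len(weights[0])
    new_rows, new_cols = rows * factor, cols * factor

    def locate(index, new_size, old_size):
        """Map a fine index to (lower index, upper index, fraction) on the coarse axis."""
        pos = index * (old_size - 1) / (new_size - 1) if new_size > 1 else 0.0
        lo = min(int(pos), old_size - 1)
        hi = min(lo + 1, old_size - 1)
        return lo, hi, pos - lo

    fine = []
    for i in range(new_rows):
        r0, r1, fy = locate(i, new_rows, rows)
        fine_row = []
        for j in range(new_cols):
            c0, c1, fx = locate(j, new_cols, cols)
            corners = zip(weights[r0][c0], weights[r0][c1],
                          weights[r1][c0], weights[r1][c1])
            # Blend the four surrounding coarse neurons, component by component.
            fine_row.append([
                (1 - fy) * ((1 - fx) * a + fx * b) + fy * ((1 - fx) * c + fx * d)
                for a, b, c, d in corners
            ])
        fine.append(fine_row)
    return fine


# Toy 2x2 lattice with one-dimensional weights (made-up values).
coarse = [[[0.0], [1.0]], [[2.0], [3.0]]]
fine = upscale_lattice(coarse, 2)
print(len(fine), len(fine[0]))  # 4 4
```

The interpolated lattice only provides a starting point; as the text notes, further training is then applied to the larger SOM.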

Training speed is not merely a function of the number of neurons, but also of the

model's dimensionality. Text documents tend to be represented with much longer vectors

than other data. The demographic data visualized in Figures 8.1-8.3 include 32 attributes,

while Skupin's visualization of AAG conference abstracts represented each abstract

as a 741-dimensional vector (Skupin, 2002). At that time, training of a 4800-neuron SOM

with the conference abstracts took 3 hours. Training of a much higher-resolution, yet very
