initial assumption is that data is available without any concrete notion of inherent
structure (cf. the KDD process in Miller and Han 2009, p. 4).
In many instances when knowledge discovery is applied to spatial planning data,
a clustering is more or less the final result of analysis, intended to answer a specific
research question (European Spatial Planning Observation Network 2011; Aumayr
2007; Blume and Sack 2010; Demsar 2009; Kronthaler 2005; Hietel et al. 2004;
Rasul et al. 2004; Thompson et al. 2002; Qu 2000). The objects of interest can be
regions, municipalities, settlement blocks, or raster cells, described by a number
of attributes and gathered into uniform clusters. A small number of clusters are
extracted from data sets that may contain thousands of individual data objects. The
aim is to identify important shared features of the target objects from such huge
pools of data in order to provide concrete findings to assist in planning decisions. In
the most recent approaches, clusters are described by measures of central tendency,
variability, or discriminant analysis (Geyler et al. 2008; Frenkel 2004; Bätzing and
Dickhörner 2001; Siedentop et al. 2003).
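As an illustration of describing clusters by central tendency and variability, the following minimal sketch computes the mean and sample standard deviation of each attribute within each cluster. The attribute names, the values, and the helper describe_clusters are hypothetical and are not taken from the studies cited above.

```python
from statistics import mean, stdev

# Hypothetical toy records: (cluster label, {attribute: value}).
# Attribute names and values are illustrative assumptions only.
records = [
    (0, {"built_up_share": 62.0, "open_space_share": 20.0}),
    (0, {"built_up_share": 58.0, "open_space_share": 24.0}),
    (1, {"built_up_share": 30.0, "open_space_share": 55.0}),
    (1, {"built_up_share": 34.0, "open_space_share": 49.0}),
]


def describe_clusters(records):
    """Return {cluster: {attribute: (mean, sample standard deviation)}}."""
    grouped = {}
    for label, attrs in records:
        for name, value in attrs.items():
            grouped.setdefault(label, {}).setdefault(name, []).append(value)
    return {
        label: {
            # stdev needs at least two values; report 0.0 for singletons.
            name: (mean(vals), stdev(vals) if len(vals) > 1 else 0.0)
            for name, vals in by_attr.items()
        }
        for label, by_attr in grouped.items()
    }


summary = describe_clusters(records)
for label, stats in sorted(summary.items()):
    for name, (m, s) in stats.items():
        print(f"cluster {label}: {name} mean={m:.1f} sd={s:.2f}")
```

A summary of this kind states what a typical member of each cluster looks like, but not which attributes actually separate one cluster from another.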
Here we propose to go a step further. In the presented approach, clustering is
merely the starting point for the actual generation of knowledge. Useful clusters
are ones that help spatial planners, politicians, and decision-makers in their actions.
Therefore, the question “What do the clusters mean?” is addressed using several
different approaches involving interaction with a human expert. A special class
of classifier-generating algorithms from machine learning is applied, with the aim
of producing human-understandable characterizations of the classes in the form of
decision rules (Alpaydin 2008; Hastie et al. 2009; Izenman 2008; Kuncheva 2004).
It should be emphasized that this technique can generate knowledge by investigating
the variables previously used for the clustering partitions (intrinsic explanation) or
by exploring other variables (extrinsic explanation).
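To make the idea of a human-understandable decision rule concrete, the sketch below searches for the single threshold rule that best separates one cluster from all others. This one-rule learner is a deliberately simple stand-in and not the classifier generation algorithm used in this chapter. The function best_rule, the attribute names, and all values are hypothetical. Passing the clustering variables corresponds to an intrinsic explanation, and passing other variables to an extrinsic one.

```python
def best_rule(rows, labels, target):
    """Find the rule 'attribute <= t' or 'attribute > t' that best separates
    cluster `target` from all other clusters, judged by accuracy.

    Returns (accuracy, attribute, operator, threshold), or None if no
    attribute has at least two distinct values.
    """
    best = None
    for attr in rows[0]:
        values = sorted({row[attr] for row in rows})
        # Candidate thresholds: midpoints between neighbouring distinct values.
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            for op in ("<=", ">"):
                hits = 0
                for row, label in zip(rows, labels):
                    predicted = row[attr] <= t if op == "<=" else row[attr] > t
                    hits += predicted == (label == target)
                acc = hits / len(rows)
                if best is None or acc > best[0]:
                    best = (acc, attr, op, t)
    return best


# Hypothetical cluster labels for four spatial units.
labels = [0, 0, 1, 1]

# Intrinsic explanation: a variable that was used for the clustering.
intrinsic = [{"built_up_share": 62.0}, {"built_up_share": 58.0},
             {"built_up_share": 30.0}, {"built_up_share": 34.0}]

# Extrinsic explanation: a variable that was not used for the clustering.
extrinsic = [{"population_density": 2100.0}, {"population_density": 1900.0},
             {"population_density": 600.0}, {"population_density": 800.0}]

for name, rows in (("intrinsic", intrinsic), ("extrinsic", extrinsic)):
    acc, attr, op, t = best_rule(rows, labels, target=0)
    print(f"{name}: cluster 0 if {attr} {op} {t} (accuracy {acc:.2f})")
```

A rule of the form "cluster 0 if built_up_share > 46.0" can be read and checked directly by a planner, which is the property that motivates rule-based characterizations over purely numerical summaries.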
This chapter is structured as follows: Sect. 3.2 introduces the sample spatial
planning data set to provide the basic framework. In Sect. 3.3, the individual steps of
the knowledge discovery approach are explained. Finally, the section “Conclusions and
Future Challenges” offers closing remarks and outlines future challenges.
3.2 Sample Spatial Data Set
Our approach to knowledge discovery for spatial planning data is demonstrated
on a data set describing land use in 111 urban districts (UDs) as a subset of
all German districts (Kreise) (n = 412). The land use in each UD is specified
by seven variables, measured in the year 2010. The data is compiled from the
Monitor of Settlement and Open Space Development (Krüger et al. 2013). This
is a scientific service operated by the Leibniz Institute of Ecological Urban and
Regional Development to provide information on land use trends in Germany (IOER
Monitor, http://www.ioer-monitor.de). The advantages of the data supplied by the IOER
Monitor are its classification, which covers a wide range of variables (see Land
Use Classification), and the explicit spatial reference of the land use categories.