3.3.7.2 Generation of New Knowledge
As elaborated in the previous section, one initial expected result of clustering is
to rediscover previously known structures in the data. The experience from many
knowledge discovery tasks (Loetsch and Ultsch 2013; Behnisch and Ultsch 2009;
Moerchen et al. 2006; Kupas et al. 2004) is that about 80 % of clusters coincide with
known processes. Typically about 10 % may be attributed to erroneous data, while
the remaining 10 % may generate entirely new knowledge.
This latter situation can be illustrated by Cluster UC5 in the U-matrix
clustering. The members of this cluster are Flensburg, Hamburg, Bremen,
Bremerhaven, Greifswald, Rostock, Stralsund, and Wismar. Bremen was found to be
the most representative object of this cluster (maximum value in the silhouettes). In
Fig. 3.14, the members of Cluster UC5 are highlighted in yellow. It can be seen that
these cluster objects are all coastal urban districts. A first observation regarding
Cluster UC5 is therefore that it represents a subset of Germany's coastal urban districts. Other
coastal urban districts are shown in a different color.
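The notion of a "most representative object" can be made concrete with silhouette values. The following is a minimal sketch, assuming Euclidean distances and the standard silhouette definition s(i) = (b - a) / max(a, b), where a is the mean distance to the other members of the object's own cluster and b is the smallest mean distance to the members of any other cluster. The coordinates and labels are invented toy data, and the function names `silhouette_values` and `most_representative` are illustrative; none of this reproduces the values of the analysis described here.

```python
from math import dist


def silhouette_values(points, labels):
    """Return the silhouette value of every point (needs at least two clusters)."""
    clusters = {}
    for idx, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(idx)

    values = []
    for i, lab in enumerate(labels):
        own = [j for j in clusters[lab] if j != i]
        if not own:
            values.append(0.0)  # usual convention for singleton clusters
            continue
        # a: mean distance to the other members of the same cluster
        a = sum(dist(points[i], points[j]) for j in own) / len(own)
        # b: smallest mean distance to the members of any other cluster
        b = min(
            sum(dist(points[i], points[j]) for j in members) / len(members)
            for other, members in clusters.items()
            if other != lab
        )
        denom = max(a, b)
        values.append((b - a) / denom if denom > 0 else 0.0)
    return values


def most_representative(points, labels, cluster):
    """Index of the cluster member with the maximum silhouette value."""
    s = silhouette_values(points, labels)
    members = [i for i, lab in enumerate(labels) if lab == cluster]
    return max(members, key=lambda i: s[i])


# Hypothetical toy data: three objects in cluster "A", two in cluster "B".
points = [(0.0, 0.0), (0.0, 2.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
labels = ["A", "A", "A", "B", "B"]
representative_index = most_representative(points, labels, "A")
```

In this toy example the object in the middle of cluster "A" (index 2) has the highest silhouette value, because it is closest on average to its fellow cluster members.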
Note that information on whether an urban district is located on the coast or is
a harbor city is not included in the variables. Some coastal urban districts are not
grouped in Cluster UC5, implying that this is more than just a collection of seaports.
The rules for the cluster can be examined in order to gain greater insight into the
particular meaning of the cluster. The rule describing Cluster UC5 is:
UD data belongs to Cluster UC5, if
OpenSpaceMeshSize > 72.6255 and
SealedSurface > 48.3179
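Such a rule can be read as a simple conjunctive filter over the data. The sketch below assumes strict "greater than" comparisons against the thresholds 72.6255 and 48.3179; the `UrbanDistrict` class, the function `in_cluster_uc5`, and the three district records are hypothetical and serve only to show how the rule is applied.

```python
from dataclasses import dataclass

# Thresholds taken from the rule; the comparison direction is an assumption.
OPEN_SPACE_MESH_SIZE_MIN = 72.6255
SEALED_SURFACE_MIN = 48.3179


@dataclass(frozen=True)
class UrbanDistrict:
    name: str
    open_space_mesh_size: float
    sealed_surface: float


def in_cluster_uc5(district: UrbanDistrict) -> bool:
    """Both conditions of the rule must hold at the same time."""
    return (
        district.open_space_mesh_size > OPEN_SPACE_MESH_SIZE_MIN
        and district.sealed_surface > SEALED_SURFACE_MIN
    )


# Hypothetical records, not values from the UD data.
districts = [
    UrbanDistrict("District A", 90.0, 55.0),  # satisfies both conditions
    UrbanDistrict("District B", 40.0, 60.0),  # sealed, but small mesh size
    UrbanDistrict("District C", 95.0, 30.0),  # large mesh size, little sealing
]

uc5_members = [d.name for d in districts if in_cluster_uc5(d)]
```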
This means that Cluster UC5 is the subset of the UD data with large values in
OpenSpaceMeshSize and large values in SealedSurface. This rule assigns all seven
districts correctly to the cluster. Other coastal urban districts are not included in
Cluster UC5. In the case of Kiel, for example, although the city possesses a fairly
high degree of sealed surface, the fragmentation of open space is substantially
higher than in the coastal urban districts of Cluster UC5. The larger the effective
mesh size in an urban district, the lower the landscape fragmentation. In our
case, the regional transport network of roads and railway lines was adopted as
a measure of landscape fragmentation. The procedure developed by Moser et al.
(2007) was applied in order to take account of target areas truncated by the borders
of administrative units (cf. Table 3.1). The interpretation of the cluster properties
leads to a hypothesis regarding UDs: there are two types of coastal urban districts
in Germany, those with high and those with low fragmentation of open space.
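The relationship between effective mesh size and fragmentation can be illustrated numerically. The sketch below assumes the usual definition of effective mesh size, m_eff = (1 / A_total) * sum(A_i^2), where the A_i are the areas of the unfragmented patches. The second function is a hedged reading of a boundary correction in which each patch contributes the product of its area inside the administrative unit and its complete area; it is an illustrative assumption, not the code used for the UD data. All areas are invented.

```python
def effective_mesh_size(patch_areas, total_area):
    """m_eff = (1 / A_total) * sum(A_i ** 2), in the area unit of the inputs."""
    return sum(a * a for a in patch_areas) / total_area


def effective_mesh_size_cross_boundary(patches, total_area):
    """Variant for patches truncated by the unit border.

    `patches` holds (area_inside_unit, complete_area) pairs, so a patch that
    continues beyond the border is not treated as if it ended there.
    """
    return sum(inside * complete for inside, complete in patches) / total_area


# Hypothetical unit of 100 area units, cut into ever smaller patches.
unfragmented = effective_mesh_size([100.0], 100.0)
halved = effective_mesh_size([50.0, 50.0], 100.0)
quartered = effective_mesh_size([25.0, 25.0, 25.0, 25.0], 100.0)

# Same two halves, but one patch extends 30 area units beyond the border.
corrected = effective_mesh_size_cross_boundary([(50.0, 80.0), (50.0, 50.0)], 100.0)
```

Each additional cut lowers the value, which is why a larger effective mesh size indicates lower fragmentation. The corrected value is higher than the uncorrected one for the same two patches, because the border no longer counts as a barrier.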
Thus, a potential meaning of Cluster UC5 is “coastal urban districts with less
landscape fragmentation due to linear transport infrastructure”. A statistically
testable hypothesis could be formulated as follows: the coastal urban districts in
Cluster UC5 are substantially different from all other such districts in Germany.
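One way to make such a hypothesis testable is a permutation test on a single variable, for example the effective mesh size. The following sketch is an assumption about how the test could be set up, not a procedure taken from the analysis: it compares the mean of the UC5 districts with the mean of the other coastal districts and enumerates every possible split of the pooled values. The function name and all numbers are hypothetical.

```python
from itertools import combinations
from statistics import mean


def exact_permutation_p_value(group_a, group_b):
    """Two-sided exact permutation test on the absolute difference in means."""
    pooled = list(group_a) + list(group_b)
    n_a, n_b = len(group_a), len(group_b)
    observed = abs(mean(group_a) - mean(group_b))
    total = sum(pooled)

    extreme = 0
    splits = 0
    # Enumerate every way of choosing n_a of the pooled values as "group a".
    for idx in combinations(range(len(pooled)), n_a):
        sum_a = sum(pooled[i] for i in idx)
        diff = abs(sum_a / n_a - (total - sum_a) / n_b)
        if diff >= observed - 1e-12:  # tolerance for floating-point ties
            extreme += 1
        splits += 1
    return extreme / splits


# Hypothetical mesh-size values, not taken from the UD data.
uc5_coastal = [90.0, 85.0, 95.0, 88.0]
other_coastal = [40.0, 35.0, 50.0, 45.0]

# Only 2 of the 70 possible splits are at least as extreme as the observed one.
p_value = exact_permutation_p_value(uc5_coastal, other_coastal)
```

Full enumeration is feasible here because the number of coastal urban districts is small; for larger groups a random sample of permutations would be the usual substitute.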
If this hypothesis cannot be refuted, then it may be worthwhile examining the
reasons behind these differences in landscape fragmentation. For example, one
can assume that several other properties will influence the characteristics of a