The increasingly complex, urbanised and connected nature of human settlement is driving a
demand for better contextual information to inform decisions about the needs and preferences of
people and the places in which they live and work. Decennial censuses of population (e.g. in the
United Kingdom) have in the past been appropriate for this task, but there are increasing numbers of
calls to supplement census sources with data that are more timely and relevant to particular
applications. Bespoke classifications aim to meet this need and differ from general-purpose classifications
by being built for a specific domain of use (e.g. health or education). Better and more intelligent
integration of a wider range of available data sources can open new horizons for depicting salient
characteristics of populations and their behaviours. The art and science of creating geodemographic
classifications has always been about much more than computational data reduction, and a key
consideration in this quest is the availability of decision support tools to present areal data from a range
of attributes in a format that is readily intelligible to the user. Thus, for example, in devising a local
indicator of school attainment, it might be appropriate to use data sources that variously measure
demographic structure, school attainment and deprivation. In assembling such sources together,
the analyst should be made aware of issues of data collection, normalisation, weighting and data
reduction method.
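
The normalisation and weighting steps mentioned above can be illustrated with a minimal sketch. It assumes min-max (range) normalisation and a simple weighted average, which are only one of several reasonable choices; the variable names, weights and values are hypothetical and are not taken from any real data source. Variables whose direction runs against the indicator would need to be inverted before being combined.

```python
def min_max(values):
    """Rescale a list of numbers to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column carries no information; map it to zeros.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def composite_indicator(columns, weights):
    """Weighted average of range-normalised columns, one score per area."""
    if set(columns) != set(weights):
        raise ValueError("columns and weights must share the same keys")
    total = sum(weights.values())
    normalised = {name: min_max(vals) for name, vals in columns.items()}
    n_areas = len(next(iter(columns.values())))
    return [
        sum(weights[name] * normalised[name][i] for name in columns) / total
        for i in range(n_areas)
    ]


# Hypothetical values for three areas (illustrative only, not real data).
columns = {
    "pct_school_age": [12.0, 18.0, 24.0],
    "attainment_score": [40.0, 55.0, 70.0],
    "deprivation_index": [30.0, 10.0, 20.0],
}
# Hypothetical weights: attainment counts twice as much as the others.
weights = {"pct_school_age": 1.0, "attainment_score": 2.0, "deprivation_index": 1.0}

scores = composite_indicator(columns, weights)
```

Changing the weights or swapping the normalisation function alters the resulting scores, which is why the analyst needs to see these choices rather than have them hidden inside the tool.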
The challenge to GC arises from the need to create geodemographic systems that are
simultaneously more responsive and more open. There are a number of motivations behind this. First, current
classifications are created from static data sources that do not necessarily reflect the dynamics of
population change in modern cities. Data are increasingly available for frequent time intervals and
offer the potential to be integrated with other traditional sources to create more timely systems. For
example, travel data recording the flow of commuters across a city network could be used to estimate
daytime population characteristics. A further example might entail extracting frequently updated
patient registrations with doctors' surgeries in order to provide a more up-to-date picture of the
residential composition of neighbourhoods. A requirement for distributed and simple-to-use online
classification tools arises from changes in the supply of socio-economic data and the potential that
this creates for end users to create new intelligence on socio-spatial structures. In addition to census
data that have been collected every 10 years in the United Kingdom, numerous supplementary data
sources are becoming available, some of which are already updated in near real time. The
availability of such resources will increase the potential to create more responsive and application-specific
geodemographic classifications that will make it less acceptable to uncritically accept the outputs of
general-purpose classifications as received wisdom. A second motivation is that application-specific
classifications have been successfully demonstrated across a variety of domains, and there are many
more sectors that could potentially benefit if the methods of construction and interpretation were
more accessible and transparent.
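
As an illustration of the commuter-flow example given earlier in this paragraph, the following sketch estimates daytime population by subtracting each zone's outbound commuters from its resident count and adding its inbound commuters. The function name, the zone codes and all counts are hypothetical, and the approach assumes that every commuter is also counted as a resident of the origin zone.

```python
def daytime_population(residents, flows):
    """Estimate daytime population per zone from origin-destination flows.

    residents: dict mapping zone -> resident population
    flows: dict mapping (origin, destination) -> number of commuters
    """
    daytime = dict(residents)
    for (origin, destination), count in flows.items():
        if origin == destination:
            # People who live and work in the same zone do not move between zones.
            continue
        daytime[origin] = daytime.get(origin, 0) - count
        daytime[destination] = daytime.get(destination, 0) + count
    return daytime


# Hypothetical zones and counts (illustrative only, not real data).
residents = {"A": 1000, "B": 500, "C": 200}
flows = {("A", "B"): 300, ("B", "A"): 50, ("A", "C"): 100, ("C", "C"): 80}

estimate = daytime_population(residents, flows)
```

Because each flow is removed from one zone and added to another, the total population is unchanged; only its daytime distribution differs from the residential one.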
We argue here that there is a need for GC web-based applications that enable the creation of
general-purpose geodemographic classifications on the fly, and we anticipate that, when building
geodemographic classifications in the future, the full process of specification, estimation and
testing will be integrated in such an online tool. In these systems, the construction process should be
guided by the objectives of the problem under investigation: the analyst selects those aspects of
society or the built environment that are to be measured and then matches them to
available absolute, surrogate or estimated data. The created measures will then
be standardised onto the same scale to enable comparison and grouping through cluster analysis.
The data comprising these measures will typically be organised in a database, the content of which
may have been drawn from disparate locations (possibly in real time), and include data related to
various time periods and spatial resolutions. The data could have been manually input, uploaded
as batch files or updated through direct calls to the remote APIs of various open data sharing websites. The
main computational overhead in building such a geodemographic information system relates to
the performance of clustering algorithms when searching for patterns of similarity amongst
high-dimensional attribute data about places. The more zones and attributes the data matrix comprises,
the longer the computation will take to find an optimal solution.
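
The standardisation and clustering steps described above can be sketched as follows. The text does not name a particular standardisation or clustering method, so this example assumes z-scores and a basic k-means (Lloyd's algorithm) purely for illustration; the function names, the seed and the data are hypothetical.

```python
import math
import random
import statistics


def standardise(rows):
    """Convert each column to z-scores (mean 0, standard deviation 1)."""
    columns = list(zip(*rows))
    means = [statistics.fmean(col) for col in columns]
    # Fall back to 1.0 for constant columns to avoid division by zero.
    sds = [statistics.pstdev(col) or 1.0 for col in columns]
    return [
        [(value - m) / s for value, m, s in zip(row, means, sds)]
        for row in rows
    ]


def k_means(rows, k, max_iter=100, seed=0):
    """Basic Lloyd's algorithm; returns one cluster label per row."""
    rng = random.Random(seed)
    centroids = [list(r) for r in rng.sample(rows, k)]
    labels = None
    for _ in range(max_iter):
        # Assignment step: each zone joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: math.dist(row, centroids[c]))
            for row in rows
        ]
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [row for row, lab in zip(rows, labels) if lab == c]
            if members:  # keep the old centroid if a cluster empties
                centroids[c] = [statistics.fmean(col) for col in zip(*members)]
    return labels


# Hypothetical zones with two attributes on very different scales.
zones = [
    [20000, 5], [21000, 6], [19000, 4],
    [40000, 30], [41000, 31], [39000, 29],
]
z = standardise(zones)
labels = k_means(z, k=2)
```

Each iteration compares every zone with every centroid across all attributes, so the work per iteration grows in proportion to the number of zones, clusters and attributes. K-means also converges only to a local optimum, so in practice it is commonly rerun from several starting points, which multiplies the cost further.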