big data streams. With approximately 600 billion transactions per day, various mobile devices are
creating approximately 1 PB (10^15 B) of data per year globally. Personal location data alone is a $100
billion business for service providers and $700 billion to end users (Manyika et al., 2011). The other
four streams of big data identified by the McKinsey Global Institute - health care, public-sector
administration, retail and manufacturing - also have a significant amount of data either geocoded
or geotagged. So geospatial data are not only an important component of big data but are actually,
to a large extent, big data themselves. For the geospatial community, big data presents not only
bigger opportunities for the business community (Francica, 2011) but also new challenges for the
scientific and scholarly communities to conduct ground-breaking studies related to people (at both
individual and collective levels) and environment (from local to global scale) (Hayes, 2012).
In fact, the geospatial community was tackling big data issues even before big data became a
buzzword or trend (Miller, 2010). From very early on, geospatial technologies were at the forefront
of big data challenges, primarily due to the large volumes of raster (remote-sensing imagery) and
vector (detailed property surveys) data that need to be stored and managed. Back in 1997, when
Microsoft Research initiated a pilot project to demonstrate database scalability, it used aerial
imagery as the primary data (Ball, 2011). The Microsoft TerraServer developed then is still in use
and functional today and sets the standard and protocol for today's other remote-sensing image
serving sites such as OpenTopography.org (LiDAR data).
The three Vs in spatial big data have raised daunting technical challenges for GC. First, as early
as 2007, our capacity to produce data had outpaced our abilities to store them (National Research
Council, 2009). Although DNA-inspired data-encoding techniques are promising (Hotz, 2012), the
lag time of practical implementation is still considerable. How to redesign our cyberinfrastructure
to better deal with the situation is thus becoming a major challenge. Second, the quality of spatial
big data is often problematic, as such data frequently have no sampling scheme, no quality control, no
metadata and no generalisability (Goodchild, 2012). Third, both analysis and synthesis of the great variety of
data generated by ubicomp are currently difficult due to the lack of interoperability, common
ontology or semantic compatibility (Sui, 2012).
16.4.2 Problems and Challenges of Ubicomp and Spatial Big Data
In addition to these technical challenges, ubicomp and spatial big data have also raised a series of
critical issues at the individual, social and environmental levels. While the potential applications
and benefits of ubicomp are well documented, concerns over ubicomp's long-term implications for
privacy, the digital divide and sustainability also need to be addressed alongside the technical
challenges outlined previously.
At the individual level, ubicomp has intensified society's concern over privacy as potentially
troubling apps such as Girls Around Me (http://girlsaround.me) can be downloaded for free from
iTunes, or condom use can be mapped using precise lat/long coordinates
(http://wheredidyouwearit.com). Location-based services and social media have further exacerbated concerns over
people's locational privacy. Resolving issues related to locational privacy requires comprehensive
approaches along legal, ethical and technical fronts (Sui, 2011). In particular, the development of
trustworthy geospatial technologies in the context of ubicomp deserves attention. Generally, two
major strategies have been developed and adopted: anonymity and obfuscation (Duckham and
Kulik, 2006). Anonymity is often regarded as one of the privacy-sympathetic technologies (PST).
It detaches or removes an individual's locational information from electronic transactions
and is normally quite effective in protecting individual privacy. However, with recent
advances in data-mining techniques, GC can integrate locational information with other data such
as remotely sensed imagery; georeferenced social, economic and cadastral data; point-of-sale data;
credit-card transactions; traffic monitoring and video surveillance imagery; and other geosensor
network data, allowing identity to be inferred. Obfuscation techniques deliberately degrade locational
information, using error and uncertainty to protect privacy, one of several privacy-enhancing