17.1 INTRODUCTION
Embedded computing involves placing computer processors within common objects - such as
refrigerators and automobiles - in order to collect data, guide operations and offer user interactivity.
Embedded software is computer software that plays an integral role in the hardware it is supplied
with. Here, the idea of embedding is extended beyond hardware to published documents.
When publishing results of spatial data analysis, it could be argued that the software used to carry
out the analysis plays an integral role in the document being published. This is part and parcel of
ensuring that others can repeat your experiments and reproduce your results: two fundamental
aspects of good science.
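A toy sketch of this idea in Python (the data and variable names here are invented purely for illustration): the number quoted in the "document" is generated by the analysis code itself rather than typed by hand, so re-running the script reproduces the reported value exactly.

```python
# Minimal sketch of embedding analysis code with a published result.
# The reported figure is computed from the data at "publication" time,
# so the prose and the analysis cannot drift apart, and a third party
# re-running this script obtains the same result. (Illustrative only;
# real workflows use literate-programming tools rather than
# hand-rolled string templates.)

import statistics

# Hypothetical spatial measurements (e.g. sampled elevations in metres).
observations = [12.1, 9.8, 11.4, 10.7, 13.0, 9.5]

mean_value = statistics.mean(observations)

# The published sentence is generated from the code, never typed by hand.
report = (
    f"The mean of the {len(observations)} sampled values "
    f"was {mean_value:.2f} m."
)

print(report)
```

Running the script regenerates both the analysis and the sentence reporting it, which is the essence of the reproducible approach discussed in the remainder of this section.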
The aforementioned practice is the fundamental concept of reproducible research (Claerbout,
1992): a term that, at the time of writing, has appeared in the scientific literature for nearly
two decades and has since gained attention in a number of fields, such as statistics (Buckheit
and Donoho, 1995; Gentleman and Temple Lang, 2004), econometrics (Koenker, 1996) and digital
signal processing (Barni et al., 2007). The idea is that when research is published, full details
of any results reported and the methods used to obtain these results should be made as widely
available as possible, so that others following the same methods can obtain the same results.
Clearly, this proposition is more practical in some areas of study than in others - it would not be a
trivial task to reproduce the chain of events leading to samples of lunar rock being obtained, for
example. However, in the area of GeoComputation (GC), and particularly spatial data analysis,
it is an achievable goal. Indeed, failure to provide reproducible results could seriously damage
the credibility of GC research.
Recent events in an area closely related to GC, climate data analysis, have drawn attention to the
need - indeed the necessity - for reproducibility. Popularly known as Climategate (Chameides, 2010;
Closing the Climategate, 2010), a number of e-mails (around 1000) from the Climatic Research Unit
(CRU) of the University of East Anglia were made public without authorisation. This led to a number
of challenges to the work of the CRU and, more generally, led some to question climate science as a
whole and, in particular, the work of the Intergovernmental Panel on Climate Change (IPCC). In turn,
this led to the university funding an independent review of the situation, chaired by Sir Muir Russell.
Although the review found that the rigour and honesty of the CRU were upheld and that there was no
evidence to undermine the findings and conclusions of the IPCC, it raised issues relating to the
reproducibility of the results of data analysis:
We believe that, at the point of publication, enough information should be available to reconstruct
the process of analysis. This may be a full description of algorithms and/or software programs where
appropriate. We note the action of NASA's Goddard Institute for Space Science in making the source
code used to generate the GISTEMP gridded dataset publically (sic) available. We also note the recom-
mendation of the US National Academy of Sciences in its report “Ensuring the Integrity, Accessibility
and Stewardship of Research Data in the Digital Age” [ author's note : published 2009] that: “ …the
default assumption should be that research data, methods (including the techniques, procedures and
tools that have been used to collect, generate or analyze data, such as models, computer code and
input data) and other information integral to a publically reported result will be publically accessible
when results are reported… ”. We commend this approach to CRU.
- Russell (2010)
Arguably, the lesson here is that it is difficult to sustain confidence in any data analysis, however
well done, if it cannot be recreated by a third party and if the steps taken in the analysis are not
open to scrutiny. Given this compelling line of argument, it is essential for GC to adopt reproducible
methodologies if it is to avoid the kind of controversies and doubts about its findings described earlier.
The continued funding and wider influence of any discipline hinges on its credibility, and for a dis-
cipline in its relatively early stages, such factors can decide whether it will burgeon in the future or
disappear as a footnote in the history of science. Given this need to adopt reproducible approaches,