this chapter will consider the implications of adopting reproducible research in the context of spatial
data analysis; some of the difficulties of doing this will be outlined, and some steps towards
achieving the goals of adoption will be proposed.
17.2 REPRODUCIBILITY IN RESEARCH
To some, the justification of reproducible research may be self-evident. It may already be seen as a
necessary condition for well-founded scientific research in GC. However, if a more concrete
argument is required, perhaps the following scenarios could be considered:
1. You have a data set that you would like to analyse using the same technique as described in
a paper recently published by another researcher in your area. In that paper, the technique
is outlined in prose form, but no explicit algorithm is given. Although you have access to
the data used in the paper and have attempted to recreate the technique, you are unable to
reproduce the results reported there.
2. You published a paper 5 years ago in which an analytical technique was applied to a data
set. You now discover an alternative method of analysis and wish to compare the results.
3. A particular form of analysis was reported in a paper; subsequently, it was discovered that
one software package offered an implementation of this method that contained errors. You
wish to check whether this affects the findings in the paper.
4. A data set used in a reported analysis was subsequently found to contain rogue data and
has now been corrected. You wish to update the analysis with the newer version of the data.
Each of the aforementioned scenarios (and several others) describes a situation that cannot be
resolved unless explicit details of the data and computational methods used when the initial work
was carried out are available. A number of situations may arise in which this is not the case. Again,
some possibilities are listed:
1. You do not have access to the data used in the original analysis, as it is confidential.
2. The data used in the original study is not confidential, but is available for a fee, and you do
not already own it.
3. The data used in the original study is freely available, but the original study does not state
the source precisely or provide a copy.
4. The steps used in the computation are not explicitly stated.
5. The steps used in the computation are explicitly stated but require software that is not free
and that you do not already own.
6. The steps used in the computation are explicitly stated, but the software required is not
open source,* so that certain details of procedures carried out are not available.
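Situations 3 and 4 arise when the data source and the computational steps go unrecorded. As an illustration only, the following Python sketch shows one way such details might be captured at the time an analysis is run. The function names, the record fields and the choice of a SHA-256 checksum are assumptions made for this example; they are not taken from the chapter and do not follow any particular standard.

```python
import hashlib
import json
import platform
from pathlib import Path


def file_sha256(path):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def provenance_record(data_path, source, steps):
    """Describe which data and software an analysis used.

    `source` and `steps` are free-text values supplied by the analyst;
    the field names are illustrative (hypothetical), not a standard.
    """
    return {
        "data_file": Path(data_path).name,
        "data_sha256": file_sha256(data_path),
        "data_source": source,
        "steps": list(steps),
        "python_version": platform.python_version(),
        "platform": platform.platform(),
    }


def write_record(record, out_path):
    """Save the record as JSON so it can be kept with the reported results."""
    with open(out_path, "w", encoding="utf-8") as handle:
        json.dump(record, handle, indent=2)


def data_unchanged(data_path, record):
    """Check whether a data file still matches the recorded checksum."""
    return file_sha256(data_path) == record["data_sha256"]
```

A record of this kind, stored alongside the reported results, states the data source and the steps explicitly. If the data set is later corrected, as in Scenario 4, the checksum comparison in `data_unchanged` would show that the file no longer matches the version originally analysed.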
17.2.1 Addressing the Problems
All of the situations given earlier stand in the way of reproducible research. In Situation 1, this state
of affairs is inevitable unless the researcher interested in reproducing the results obtains consent
to access the data. Situations 2 and 5 can be resolved by financial outlay if sufficient funds are
available, but Situations 3, 4 and 6 cannot be resolved in this way. For this last set of situations, it
is argued that resolution is achieved if the author(s) adopt certain practices at the time the research
is executed and reported (Barnes, 2010). Situation 6 is in some ways a variant of Situation 4 where
* The distinction is made here between open source software and zero cost software - which is obtained without fee, but
may not have openly available source code. For this argument to hold, source code must be available.